Abstract

We consider a MIMO fading broadcast channel and compute achievable ergodic rates when channel state 
information is acquired at the receivers via downlink training and it is provided to the transmitter by channel state 
feedback. Unquantized (analog) and quantized (digital) channel state feedback schemes are analyzed and compared 
under various assumptions. Digital feedback is shown to be potentially superior when the number of feedback channel uses per
channel state coefficient is larger than 1. Also, we show that by proper design of the digital feedback link, errors in
the feedback have a minor effect even if simple uncoded modulation is used on the feedback channel. We discuss 
first the case of an unfaded AWGN feedback channel with orthogonal access and then the case of fading MIMO 
multi-access (MIMO-MAC). We show that by exploiting the MIMO-MAC nature of the uplink channel, a much 
better scaling of the feedback channel resource with the number of base station antennas can be achieved. Finally, 
for the case of delayed feedback, we show that in the realistic case where the fading process has (normalized) 
maximum Doppler frequency shift $0 < F < 1/2$, a fraction $1 - 2F$ of the optimal multiplexing gain is achievable.
The general conclusion of this work is that very significant downlink throughput is achievable with simple and 
efficient channel state feedback, provided that the feedback link is properly designed. 

I. Introduction 

In the downlink of a cellular-like system, a base station equipped with multiple antennas communicates with 
a number of terminals, each possibly equipped with multiple receive antennas. If a traditional orthogonalization 
technique such as TDMA is used, the base station transmits to a single receiver on each time-frequency resource 
and thus is limited to point-to-point MIMO techniques [1], [2]. Alternatively, the base station can use multi-user 
MIMO to simultaneously transmit to multiple receivers on the same time-frequency resource. Under the assumption 
of perfect channel state information at the transmitter (CSIT) and at the receivers (CSIR), a combination of single- 
user Gaussian codes, linear beamforming and "Dirty-Paper Coding" (DPC) [3] is known to achieve the capacity 
of the MIMO downlink channel [4], [5], [6], [7], [8]. When the number of base station antennas is larger than the 
number of antennas at each terminal, the capacity of the MIMO downlink channel is significantly larger than the 
rates achievable with point-to-point MIMO techniques [4], [9], [10]. 

Given the widespread applicability of the MIMO downlink channel model (e.g., to cellular, WiFi, and DSL), it 
is of great interest to design systems that can operate near the capacity limit. Although realizing the optimal DPC 
coding strategy still remains a formidable challenge (see for example [11], [12], [13]), it has been shown that linear
beamforming without DPC performs quite close to capacity when combined with user selection, again under the 
simplifying assumption of perfect channel state information (see for example [14], [15]). 

In real systems, however, channel state information is not a priori provided and must be acquired, e.g., through 
training. Acquiring the channel state is a challenging and resource-consuming task in time-varying systems, and the 
obtained information is inevitably imperfect. It is therefore critical to understand what rates are achievable under 
realistic channel state information assumptions, and in particular to understand the sensitivity of achievable rates 
to such imperfections. To emphasize the importance of channel state information, note that in the extreme case of 
no CSIT at the BS and identical fading statistics (and perfect CSIR) at all terminals, the multi-user MIMO benefit 
is completely lost and point-to-point MIMO becomes optimal [4]. 
A. Contributions of this work

The focus of this paper is a rigorous information theoretic characterization of the ergodic achievable rates of a 
fading multiuser MIMO downlink channel in which the UTs and the BS obtain imperfect CSIR/CSIT via downlink 
training and channel state feedback.^1 Converse results on the capacity region of the MIMO broadcast channel
with imperfect channel knowledge are essentially open (see for example [16] and [17] for some partial results). 
Here, we focus on the achievable rates of a specific signaling strategy, zero-forcing (ZF) linear beamforming. 
Consistently with contemporary wireless system technology, we assume that each UT estimates its own channel 
during a downlink training phase and then feeds back its estimate over the reverse uplink channel to the BS. The 
BS designs beamforming vectors on the basis of the received channel feedback, after which an additional round 
of downlink training is performed (essentially to inform the UTs of the selected beamformers). Our results tightly 
bound the rate that is achievable after this process in terms of the resources (i.e., channel symbols) used for training 
and feedback and the channel feedback technique. 

The analysis in this paper falls within the line of work on the "training capacity" [18] of block-fading
channels. Several previous and concurrent works have treated training and channel feedback for point-to-point
MIMO systems (see for example [19], [20], [21], [22], [23], [24], [25]) and, more recently, for MIMO broadcast
channels (see for example [26], [27], [28], [29], [30], [31], [32]). However, this paper presents a number of novelties
relative to prior/concurrent work: 

• Rather than assuming perfect CSIR at the UTs, we consider the realistic scenario where the UTs have imperfect
CSIR obtained via downlink training. Because the imperfect CSIR is the basis for the channel feedback from 
the UTs, this degrades the quality of the CSIT provided to the BS in a non-negligible manner. 

• Instead of idealizing the feedback channel as a fixed-rate, error-free bit-pipe, we explicitly consider transmission 
from each UT to the BS over the noisy feedback channel. This reveals the fundamental joint source-channel 
coding nature of channel feedback. In addition, this allows us to meaningfully measure the uplink resources 
dedicated to channel feedback and also allows for a comparison between analog (unquantized) and digital 
(quantized) feedback. We begin by modeling the feedback channel as an AWGN channel (orthogonal across 
UTs), and later generalize to a multiple-antenna uplink channel that is shared by the UTs. In this way, we
precisely quantify the fundamental advantage of using the multiple BS antennas for efficient channel state 
feedback. 

• A fundamental property of the system is that UTs are unaware of the chosen beamforming vectors, because 
the beamformers depend on all channels whereas each UT only has an estimate of its own channel. Several 
previous works (e.g., [26], [33], [34]) have resolved this uncertainty by making the unstated assumption that 
each UT has perfect knowledge of the post-beamforming SINR. In contrast, we make no such assumption and 
rigorously show that this ambiguity can be resolved by an additional round of (dedicated) training. 

• Most prior work has used a worst-case uncorrelated noise argument [35], [36], [18] to show that imperfect CSI,
at worst, leads to the introduction of additional Gaussian noise and thus the achievable rate is lower bounded
by the mutual information with ideal channel state information and reduced SNR. In our case, however, this
same argument yields a largely uncomputable quantity and a further step must be taken that yields a tractable
lower bound in terms of the rate difference between the ideal and actual cases, rather than in terms of an SNR
penalty.

• We consider delayed feedback and quantify in a simple and appealing form the loss of degrees of freedom 
(pre-log factor in the achievable rate) in terms of the fading channel Doppler bandwidth, which is ultimately 
related to UT velocity. 

The analysis presented in this paper is relevant from at least two related but different viewpoints. On one hand, 
it provides accurate bounds on the achievable ergodic rates of the linear zero-forcing beamforming scheme with 
realistic channel estimation and feedback. These bounds are useful at any operating SNR (not necessarily large), and 
in subsequent work have been used to optimize the system resources allocated for training and feedback [37][38]. 
On the other hand, it yields sufficient conditions on the training and feedback such that the system achieves the same 
multiplexing gain (also referred to as "pre-log factor", or "degrees of freedom") of the optimal DPC-based scheme 
under perfect CSIR/CSIT. Perhaps the most striking fact about this second aspect is that the full multiplexing gain
of the ideal MIMO broadcast channel can be achieved with simple pilot-based channel estimation and feedback
schemes that consume a relatively small fraction of the system capacity. Indeed, a fundamental property of the
MIMO broadcast channel is that the quality of the CSIT must increase with signal-to-noise ratio (SNR), regardless
of what coding strategy is used, in order for the full multiplexing gain to be achievable [16], [17]. Under the
reasonable assumption that the uplink channel quality is in some sense proportional to the downlink channel, our
work shows that this requirement can be met using a fixed number of downlink and uplink channel symbols (i.e.,
system resources used for training and feedback need not increase with SNR).

^1 Since this work considers feedback schemes where the roles of transmitter and receiver are reversed, we avoid using "transmitter" and
"receiver" and prefer the use of BS and UT instead, in order to avoid ambiguity.

When there is a significant delay in the feedback loop, the simple scheme analyzed in this paper does not attain 
full multiplexing gain. However, for fading processes with normalized Doppler bandwidth F strictly less than 1/2, 
we show the achievability of a multiplexing gain equal to $M(1 - 2F)$, where $M$ is the number of BS antennas.
This result follows from a fundamental property of the noisy prediction error of the channel process and is closely
related to Lapidoth's high-SNR capacity of single-user fading channels without the perfect CSIR assumption [39].

The paper is organized as follows. Section II introduces the system model, describes linear beamforming, and
defines the baseline estimation, feedback, and beamforming strategy. Section III develops bounds on the ergodic
rates achievable by the baseline scheme. In Section IV we consider an AWGN feedback channel and particularize
the rate bounds to analog and digital feedback (incorporating the effect of decoding errors for digital feedback),
and compare the different feedback options. Section V generalizes the results to the setting where the feedback
link is a fading MIMO multiple access channel (MAC). Section VI considers time-correlated fading and the effect
of delay in the feedback link. Some concluding remarks are provided in Section VII.



II. System Model 

We consider a multi-input multi-output (MIMO) Gaussian broadcast channel modeling the downlink of a system 
where a Base Station (BS) has M antennas and K User Terminals (UTs) have one antenna each. A channel use 
of such channel is described by 

$$ y_k = \mathbf{h}_k^{H} \mathbf{x} + z_k, \quad k = 1, \ldots, K \tag{1} $$

where $y_k$ is the channel output at UT $k$, $z_k \sim \mathcal{CN}(0, N_0)$ is the corresponding Additive White Gaussian Noise
(AWGN), $\mathbf{h}_k \in \mathbb{C}^{M}$ is the vector of channel coefficients from the $k$-th UT antenna to the BS antenna array
(the superscript $H$ refers to the Hermitian, or conjugate transpose) and $\mathbf{x}$ is the vector of channel input symbols
transmitted by the BS. The channel input is subject to the average power constraint $E[|\mathbf{x}|^2] \le P$.

We assume that the channel state, given by the collection of all channel vectors $\mathbf{H} = [\mathbf{h}_1, \ldots, \mathbf{h}_K] \in \mathbb{C}^{M \times K}$,
varies in time according to a block-fading model [40], where $\mathbf{H}$ is constant over each frame of length $T$ channel
uses, and evolves from frame to frame according to an ergodic stationary spatially white jointly Gaussian process,
where the entries of $\mathbf{H}$ are i.i.d. $\sim \mathcal{CN}(0, 1)$. Our bounds on the ergodic achievable rate do
not directly depend on the frame size $T$; rather, these bounds depend only on whether the training, feedback, and
data phases all occur within a frame or in different frames. In Sections IV and V we consider the simplified scenario
where the three phases all occur within a single frame (i.e., the channel is constant across the phases) and fading
is independent across blocks, but we remove these simplifications in Section VI. It should also be noticed that the
rate lower bounds given in the following should be multiplied by the factor $(1 - \Delta/T)$, where $\Delta$ denotes the total
number of channel uses per frame dedicated to training and feedback. This factor is neglected in this paper since it
is common to all rate bounds and since $\Delta \ll T$ in a typical slowly-fading system scenario. However, in the general
case where $\Delta$ is not necessarily small with respect to $T$, the amount of training and feedback should be optimized
by taking this multiplicative factor into account. Based on the bounds developed in the present paper, this system
optimization is carried out in the follow-up works [37], [38].

A. Linear beamforming 

Because of their simplicity and robustness to non-perfect CSIT, simple linear precoding schemes with standard
Gaussian coding have been extensively considered: the transmit signal is formed as $\mathbf{x} = \mathbf{V}\mathbf{u}$, such that $\mathbf{V} \in \mathbb{C}^{M \times K}$
is a linear beamforming matrix and $\mathbf{u} \in \mathbb{C}^{K}$ contains the symbols from $K$ independently generated Gaussian
codewords. In particular, for $K \le M$ Zero-Forcing (ZF) beamforming chooses $\mathbf{v}_k$, the $k$-th column of $\mathbf{V}$, to be a unit
vector orthogonal to the subspace $\mathcal{S}_k = \mathrm{span}\{\mathbf{h}_j : j \neq k\}$.
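The ZF rule can be illustrated with a small numerical sketch. The code below is only one possible construction, assuming a Gram-Schmidt procedure in plain Python: each beamformer is taken as the normalized component of $\mathbf{h}_k$ orthogonal to the other users' channels. The helper names (`crandn`, `inner`, `zf_beamformers`) are hypothetical and are not part of the paper.

```python
import random

def crandn(rng):
    """One CN(0, 1) sample: real and imaginary parts each have variance 1/2."""
    s = 0.5 ** 0.5
    return complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) * s

def inner(a, b):
    """Hermitian inner product a^H b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def unit(a):
    """Scale a vector to unit Euclidean norm."""
    n = abs(inner(a, a)) ** 0.5
    return [x / n for x in a]

def project_out(vec, basis):
    """Remove from vec its components along the orthonormal vectors in basis."""
    out = list(vec)
    for q in basis:
        c = inner(q, out)
        out = [x - c * y for x, y in zip(out, q)]
    return out

def zf_beamformers(H):
    """H is a list of K channel vectors of length M (K <= M).
    Returns unit vectors v_k orthogonal to span{h_j : j != k}."""
    V = []
    for k, hk in enumerate(H):
        basis = []  # orthonormal basis of span{h_j : j != k}
        for j, hj in enumerate(H):
            if j != k:
                basis.append(unit(project_out(hj, basis)))
        V.append(unit(project_out(hk, basis)))
    return V

rng = random.Random(0)
M = K = 4
H = [[crandn(rng) for _ in range(M)] for _ in range(K)]
V = zf_beamformers(H)
# Largest cross-coupling |h_k^H v_j| over j != k; zero up to rounding error.
leak = max(abs(inner(H[k], V[j])) for k in range(K) for j in range(K) if j != k)
print(leak)
```

For $K = M$ the orthogonal complement of $\mathcal{S}_k$ is one-dimensional, so this choice is unique up to a phase; for $K < M$ it is one valid choice among many.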
Fig. 1. Channel estimation and feedback model

We focus on the achievable ergodic rates under ZF linear beamforming and Gaussian coding. In this case, the 
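achievable rate-sum is given by

$$ \max_{\sum_{k} E[P_k(\mathbf{H})] \le P} \; \sum_{k=1}^{K} E\left[ \log\left( 1 + \frac{|\mathbf{h}_k^{H} \mathbf{v}_k|^2 P_k(\mathbf{H})}{N_0} \right) \right] \tag{2} $$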



where the optimal power allocation is obtained by waterfilling over the set of channel gains $\{|\mathbf{h}_k^{H}\mathbf{v}_k|^2 : k = 1, \ldots, K\}$.
Performance can further be improved by using a user scheduling algorithm to select in each frame an
active user subset not larger than $M$ (if $K > M$, such selection must be performed if ZF is used). Schemes for
user scheduling have been extensively discussed, for example in [41], [32], [15], [42].

We focus, however, on the case $K = M$ with uniform power allocation (across users and frames: $P_k(\mathbf{H}) = P/M$)
and without user selection, in which case the per-user ergodic rate is

$$ R_k^{\rm ZF}(P) = E\left[ \log\left( 1 + \frac{|\mathbf{h}_k^{H}\mathbf{v}_k|^2 P}{N_0 M} \right) \right] \tag{3} $$



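Because $\mathbf{h}_k$ is spatially white and $\mathbf{v}_k$ is selected independent of $\mathbf{h}_k$ (by the ZF procedure), it follows that $\mathbf{h}_k^{H}\mathbf{v}_k$ is
$\sim \mathcal{CN}(0, 1)$. As a result, $R_k^{\rm ZF}(P)$ is the ergodic capacity of a point-to-point channel in Rayleigh fading with average
SNR $\frac{P}{N_0 M}$, and thus can be written in closed form (in nats) as [43] $R_k^{\rm ZF}(P) = \exp\left(\frac{N_0 M}{P}\right) E_i\left(1, \frac{N_0 M}{P}\right)$, where $E_i(n, x) = \int_1^{\infty} \frac{e^{-xt}}{t^{n}}\, dt$, $x > 0$ [44].
In the remainder of the paper $R_k^{\rm ZF}(P)$ serves as a benchmark against which we compare the
achievable rates with imperfect CSI.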
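As a sanity check on the closed form, the following sketch compares $\exp(1/\rho)\,E_i(1, 1/\rho)$, with $\rho = P/(N_0 M)$, against a Monte Carlo average of $\log(1 + \rho g)$ where $g$ is exponentially distributed with unit mean (the distribution of $|\mathbf{h}_k^{H}\mathbf{v}_k|^2$ under the stated assumptions). The numerical integration scheme and the function names are illustrative choices, not taken from the paper; rates are in nats.

```python
import math
import random

def exp_integral_e1(x, steps=100000):
    """E_i(1, x) = int_1^inf exp(-x t)/t dt for x > 0.
    The substitution t = 1/s maps it to int_0^1 exp(-x/s)/s ds,
    which is evaluated here with the midpoint rule."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        total += math.exp(-x / s) / s
    return total * h

def rate_zf_closed_form(snr_per_user):
    """exp(1/rho) * E_i(1, 1/rho) in nats, with rho = P/(N0 M)."""
    x = 1.0 / snr_per_user
    return math.exp(x) * exp_integral_e1(x)

def rate_zf_monte_carlo(snr_per_user, trials=200000, seed=1):
    """Average of log(1 + rho * g) with g ~ Exp(1)."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(trials):
        gain = rng.expovariate(1.0)  # |h^H v|^2 when h^H v ~ CN(0, 1)
        acc += math.log(1.0 + snr_per_user * gain)
    return acc / trials

rho = 10.0
print(rate_zf_closed_form(rho), rate_zf_monte_carlo(rho))
```

The two printed values should agree up to Monte Carlo fluctuation; dividing by $\log 2$ converts the rate to bits per channel use.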

This restriction is dictated by a few reasons. On one hand, the case K = M without selection makes closed-form 
analysis (in the presence of imperfect CSI) possible. In addition, the maximum multiplexing gain is M for all 
$K \ge M$ and hence the case $K = M$ suffices to capture the fundamental aspects of the problem (particularly at
high SNR). Finally, recent results [33], [45] show that the dependence on CSI quality is roughly the same even 
when user selection is performed. 

B. Channel state estimation and feedback 

We assume that each UT estimates its channel vector from downlink training symbols and then feeds this 
information back to the BS. This scenario, referred to as "closed-loop" CSIT estimation, is relevant for Frequency-Division
Duplexed (FDD) systems. Our baseline system is depicted in Fig. 1 and consists of the following phases:

1) Common Training: The BS transmits $\beta_1 M$ shared pilots ($\beta_1 \ge 1$ symbols per antenna) on the downlink.^2
Each UT $k$ estimates its channel from the observation

$$ \mathbf{s}_k = \sqrt{\beta_1 P}\, \mathbf{h}_k + \mathbf{z}_k \tag{4} $$

corresponding to the common training (downlink) channel output, where $\mathbf{z}_k \sim \mathcal{CN}(0, N_0 \mathbf{I})$. The MMSE
estimate $\tilde{\mathbf{h}}_k$ of $\mathbf{h}_k$ given the observation $\mathbf{s}_k$ is given by [46]:

$$ \tilde{\mathbf{h}}_k = E[\mathbf{h}_k \mathbf{s}_k^{H}]\, E[\mathbf{s}_k \mathbf{s}_k^{H}]^{-1} \mathbf{s}_k = \frac{\sqrt{\beta_1 P}}{N_0 + \beta_1 P}\, \mathbf{s}_k \tag{5} $$

The channel can be written in terms of the estimate and estimation noise as:

$$ \mathbf{h}_k = \tilde{\mathbf{h}}_k + \mathbf{n}_k \tag{6} $$

where $\mathbf{n}_k$ is independent of the estimate and is Gaussian with covariance $\sigma_1^2 \mathbf{I}$ with

$$ \sigma_1^2 = \frac{1}{1 + \beta_1 P / N_0} \tag{7} $$

^2 If $\beta_1$ is an integer, pilot symbols can be orthogonal in time, i.e., $\beta_1$ pilots are successively transmitted from each of the $M$ BS antennas
for a total of $\beta_1 M$ channel uses. More generally, it is sufficient for $\beta_1 M$ to be an integer and to use a unitary $M \times \beta_1 M$ spreading matrix
as described in [28]; in either case the effective received SNR is $\beta_1 P / N_0$.

2) Channel State Feedback: Each UT feeds back its channel estimate $\tilde{\mathbf{h}}_k$ to the BS immediately after
completion of the common training phase. We use $\hat{\mathbf{H}} = [\hat{\mathbf{h}}_1, \ldots, \hat{\mathbf{h}}_K] \in \mathbb{C}^{M \times K}$ to denote the (imperfect)
CSIT available at the BS; the feedback is thus a mapping, possibly probabilistic, from $\tilde{\mathbf{h}}_k$ to $\hat{\mathbf{h}}_k$. For now we
leave the feedback scheme unspecified to allow development of general achievability bounds in Section III
and particularize to specific feedback schemes from Section IV onwards.

In Section IV we consider the simplified setting where the feedback channel is an unfaded AWGN channel,
orthogonal across UTs, but in Section V we consider the more realistic setting where the uplink
channel is a MIMO-MAC with fading. Furthermore, the baseline model of Fig. 1 assumes no delay in the
feedback, i.e., the channel is constant across the training, feedback, and data phases. In Section VI we remove
this assumption and consider the case where feedback has delay and the channel state changes from frame
to frame according to a time-correlation model.

We assume each UT transmits its feedback over $\beta_{\rm fb} M$ feedback channel symbols.

3) Beamformer Selection: The BS selects the beamforming vectors by treating the estimated CSIT $\hat{\mathbf{H}}$ as if it
were the true channel (we refer to this approach as "naive" ZF beamforming). Following the ZF recipe, $\hat{\mathbf{v}}_k$ is a
unit vector orthogonal to the subspace $\hat{\mathcal{S}}_k = \mathrm{span}\{\hat{\mathbf{h}}_j : j \neq k\}$. We use the notation $\hat{\mathbf{V}} = [\hat{\mathbf{v}}_1, \ldots, \hat{\mathbf{v}}_M]$. Since
$K = M$ and the BS channel estimates $\hat{\mathbf{h}}_1, \ldots, \hat{\mathbf{h}}_M$ are independent, the subspace $\hat{\mathcal{S}}_k$ is $(M - 1)$-dimensional
(with probability one) and is independent of $\hat{\mathbf{h}}_k$. The beamforming vector $\hat{\mathbf{v}}_k$ is chosen in the one-dimensional
nullspace of $\hat{\mathcal{S}}_k$; as a result $\hat{\mathbf{v}}_k$ is independent of the channel estimate $\hat{\mathbf{h}}_k$ and of the true channel vector $\mathbf{h}_k$.

4) Dedicated Training: Once the BS has computed the beamforming vectors $\hat{\mathbf{V}}$, coherent detection of
data at each UT is enabled by an additional round of downlink training transmitted along each beamforming
vector. This additional round of training is required because the beamforming vectors $\{\hat{\mathbf{v}}_k\}$ are functions of
the channel state information $\{\hat{\mathbf{h}}_1, \ldots, \hat{\mathbf{h}}_M\}$ at the BS, while UT $k$ knows only $\tilde{\mathbf{h}}_k$ or, at best, $\hat{\mathbf{h}}_k$ (if error-free
digital feedback is used). Therefore, the coupling coefficients between the beamforming vectors and the UT
channel vector are unknown.

Let the set of the coefficients affecting the signal received by UT $k$ be denoted by

$$ \mathcal{A}_k = \{a_{k,j} : j = 1, \ldots, M\} $$

where $a_{k,j} = \mathbf{h}_k^{H}\hat{\mathbf{v}}_j$ is the coupling coefficient between the $k$-th channel and the $j$-th beamforming vector.
The received signal at the $k$-th UT is given by

$$ y_k = \mathbf{h}_k^{H}\hat{\mathbf{V}}\mathbf{u} + z_k = a_{k,k} u_k + \sum_{j \neq k} a_{k,j} u_j + z_k = a_{k,k} u_k + I_k + z_k \tag{8} $$

where the interference at UT $k$ is denoted as

$$ I_k = \sum_{j \neq k} a_{k,j} u_j \tag{9} $$

and $a_{k,k}$ is the useful signal coefficient. The dedicated training is intended to allow the estimation of the
coefficients in $\mathcal{A}_k$ at each UT $k$. This is accomplished by transmitting $\beta_2$ orthogonal training symbols along
each of the beamforming vectors on the downlink, thus requiring a total of $\beta_2 M$ downlink channel uses.^3
The relevant observation model for the estimation of $\mathcal{A}_k$ is given by

$$ r_{k,j} = \sqrt{\beta_2 P}\, a_{k,j} + z_{k,j}, \quad j = 1, \ldots, M \tag{10} $$

We denote the full set of observations available to UT $k$ as:

$$ \mathcal{R}_k = \{r_{k,j} : j = 1, \ldots, M\}. $$

In particular, we shall consider explicitly the case where UT $k$ estimates its useful signal coefficient using
linear MMSE estimation based on $r_{k,k}$, i.e.,

$$ \hat{a}_{k,k} = \frac{\sqrt{\beta_2 P}}{N_0 + \beta_2 P}\, r_{k,k} \tag{11} $$

Because $\hat{\mathbf{v}}_k$ is a unit vector independent of $\mathbf{h}_k$, the useful signal coefficient $a_{k,k} = \mathbf{h}_k^{H}\hat{\mathbf{v}}_k$ is complex Gaussian
with unit variance. As a result we have the representation

$$ a_{k,k} = \hat{a}_{k,k} + f_k \tag{12} $$

where $f_k$ and $\hat{a}_{k,k}$ are independent and Gaussian with variance $\sigma_2^2$ and $1 - \sigma_2^2$, respectively, with

$$ \sigma_2^2 = \frac{1}{1 + \beta_2 P / N_0} \tag{13} $$

5) Data Transmission: After the dedicated downlink training phase, the BS sends the coded data symbols
$u_1, \ldots, u_K$ for the rest of the frame duration. The effective channel output for this phase is therefore given
by the sequence of corresponding channel output symbols $y_k$ given by (8), and by the observation of the
dedicated training phase $\mathcal{R}_k$ given by (10).

When considering the ergodic rates achievable by the proposed scheme, we implicitly assume that coding is 
performed over a long sequence of frames, each frame comprising a common training phase, channel state 
feedback phase, dedicated training phase and data transmission. 
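Both training phases above rely on the same scalar estimation step: a unit-variance complex Gaussian coefficient observed through a pilot of energy $\beta P$ in noise of variance $N_0$. The sketch below simulates that step and compares the empirical error variance with $1/(1 + \beta P/N_0)$, the form of the error variance in each phase. It is an illustration under these assumptions only, and the function names are hypothetical.

```python
import random

def crandn(rng, var=1.0):
    """One CN(0, var) sample."""
    s = (var / 2.0) ** 0.5
    return complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) * s

def mmse_error_variance(beta, P, N0):
    """Theoretical MMSE error variance 1 / (1 + beta * P / N0)."""
    return 1.0 / (1.0 + beta * P / N0)

def simulate_pilot_estimation(beta, P, N0, trials=100000, seed=0):
    """Simulate r = sqrt(beta P) a + z and the linear MMSE estimate of a.
    Returns (empirical error variance, |empirical E[error * conj(estimate)]|)."""
    rng = random.Random(seed)
    gain = (beta * P) ** 0.5
    scale = gain / (N0 + beta * P)  # MMSE coefficient applied to the observation
    err = 0.0
    cross = 0j
    for _ in range(trials):
        a = crandn(rng)                   # true coefficient, CN(0, 1)
        r = gain * a + crandn(rng, N0)    # pilot observation
        a_hat = scale * r
        e = a - a_hat
        err += abs(e) ** 2
        cross += e * a_hat.conjugate()
    return err / trials, abs(cross) / trials

empirical, correlation = simulate_pilot_estimation(beta=2.0, P=10.0, N0=1.0)
print(empirical, mmse_error_variance(2.0, 10.0, 1.0), correlation)
```

The near-zero correlation between error and estimate reflects the orthogonality property that the decompositions (6) and (12) depend on.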

We conclude this section with a few remarks. First, we observe that using two phases of training, a common
"pilot channel" and dedicated per-user training symbols, is common practice in some wireless cellular systems, as
for example in the downlink of the 3rd generation Wideband CDMA standard [47] and in the MIMO component
of future 4th generation systems [48]. Second, we note that an alternative to FDD is Time-Division Duplexing
(TDD), where uplink and downlink share in time-division the same frequency band. In this case, provided that
the coherence time is significantly larger than the concatenation of an uplink and downlink slot, and given hardware
calibration, the downlink channel can be learned by the BS from uplink training symbols [28], [49]. Although we
focus on FDD systems, in Remark 4.2 we note the straightforward extension of our results to TDD systems.



III. Achievable Rate Bounds 

We assume that the user codes are independently generated according to an i.i.d. Gaussian distribution, i.e., the
input symbols are $u_k \sim \mathcal{CN}(0, P/M)$. The remainder of this section is dedicated to deriving upper and lower
bounds on the mutual information achieved by such Gaussian inputs, indicated by $R_k = I(u_k; y_k, \mathcal{R}_k)$.

A. Lower Bounds 

The following lower bound is obtained by using techniques similar to those in [35], [18], [36]. 

Theorem 1: The achievable rate for ZF beamforming with Gaussian inputs and CSI training and feedback as
described in Section II-B can be bounded from below by:

$$ R_k \ge E\left[ \log\left( 1 + \frac{|\hat{a}_{k,k}|^2 P/(N_0 M)}{1 + \sigma_2^2 P/(N_0 M) + E\left[|I_k|^2 \mid \hat{a}_{k,k}\right]/N_0} \right) \right] \tag{14} $$

Proof: See Appendix I. □



^3 If $\beta_2 M$ is an integer but $\beta_2$ is not, the unitary spreading approach used for common training can also be used here.
The conditional interference second moment $E[|I_k|^2 \mid \hat{a}_{k,k}]$ in (14) may be difficult to compute even by Monte
Carlo simulation, due to the complicated dependency of $I_k$ on $\hat{a}_{k,k}$ (this dependence is unknown even if the
dedicated training is perfect, i.e., $\hat{a}_{k,k} = a_{k,k}$). However, we will not need to compute this explicitly, as is seen in
our next results.

A very useful measure is the difference between $R_k$ and $R_k^{\rm ZF}$, the achievable rate with ZF beamforming and
ideal CSI defined in (3). The rate gap is defined as follows

$$ \Delta R \triangleq R_k^{\rm ZF} - R_k \tag{15} $$

and is upper bounded in the following theorem.

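Theorem 2: The rate gap incurred by ZF beamforming with training and feedback as described in Section II-B
with respect to ideal ZF with equal power allocation is upper bounded by:

$$ \Delta R \le \log\left( 1 + \sigma_2^2 \frac{P}{N_0 M} + \frac{E[|I_k|^2]}{N_0} \right) \tag{16} $$

Proof: See Appendix II. □

For clarity of notation, we denote the RHS of the above, referred to as the rate gap upper bound, as $\overline{\Delta R}$:

\begin{align}
\overline{\Delta R} &\triangleq \log\left( 1 + \sigma_2^2 \frac{P}{N_0 M} + \frac{E[|I_k|^2]}{N_0} \right) \tag{17} \\
&= \log\left( 1 + \sigma_2^2 \frac{P}{N_0 M} + \frac{M-1}{M} \frac{P}{N_0} E\left[|\mathbf{h}_k^{H}\hat{\mathbf{v}}_j|^2\right] \right) \tag{18}
\end{align}

where the latter follows from a simple calculation of $E[|I_k|^2]$ (the $M - 1$ interfering symbols are independent with variance $P/M$).
The term $\sigma_2^2$ depends only on dedicated training;
on the other hand, $E[|\mathbf{h}_k^{H}\hat{\mathbf{v}}_j|^2]$ is determined by the mismatch between $\mathbf{h}_k$ and the BS estimate $\hat{\mathbf{h}}_k$ (because $\hat{\mathbf{v}}_j$ is
chosen orthogonal to $\hat{\mathbf{h}}_k$ rather than $\mathbf{h}_k$) and therefore depends on the common training and feedback phases.
An immediate consequence of the rate gap upper bound is the following lower bound to $R_k$.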

Corollary 3.1: The achievable rate for ZF beamforming with Gaussian inputs and CSIT training and feedback
as described in Section II-B can be bounded from below by:

$$ R_k \ge R_k^{\rm ZF} - \overline{\Delta R} \tag{19} $$

Because only the estimate of $a_{k,k}$ is used in the derivation, Corollary 3.1 is also a lower bound to $I(u_k; y_k, r_{k,k})$.



B. Upper Bounds 

A useful upper bound to $R_k$ is obtained by providing each UT $k$ with exact knowledge of the interference
coefficients $\mathcal{A}_k$. This is referred to as the "genie-aided upper bound".

Theorem 3: The achievable rate for ZF beamforming with Gaussian inputs and CSI training and feedback is
upper bounded by the rate achievable when, after the beamforming matrix $\hat{\mathbf{V}}$ is chosen, a genie provides the $k$-th
UT with perfect knowledge of the coefficients $\mathcal{A}_k = \{a_{k,j} = \mathbf{h}_k^{H}\hat{\mathbf{v}}_j : j = 1, \ldots, M\}$:

$$ R_k \le E\left[ \log\left( 1 + \frac{|a_{k,k}|^2 P/(N_0 M)}{1 + \sum_{j \neq k} |a_{k,j}|^2 P/(N_0 M)} \right) \right] \tag{20} $$

Proof: Since $\mathcal{R}_k$ is a noisy version of $\mathcal{A}_k$, the data-processing inequality yields

$$ R_k = I(u_k; y_k, \mathcal{R}_k) \le I(u_k; y_k, \mathcal{A}_k) \tag{21} $$

Because $y_k$ conditioned on $\mathcal{A}_k$ is complex Gaussian with variance $N_0 + \sum_{j=1}^{M} |a_{k,j}|^2 P/M$ while $y_k$ conditioned
on $(\mathcal{A}_k, u_k)$ is complex Gaussian with variance $N_0 + \sum_{j \neq k} |a_{k,j}|^2 P/M$, we immediately obtain (20). □

The practical relevance of Theorem 3 is two-fold: on one hand, (20) is easy to evaluate by Monte Carlo
simulation.^4 On the other hand, this bound can be approached for large $\beta_2$, since in this case each UT can accurately
estimate all interference coupling coefficients and not only the useful signal coefficient.
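A Monte Carlo evaluation of the genie-aided bound can be sketched as follows. The sketch assumes a specific CSIT error model that the general theorem does not require: the true channel equals the BS estimate plus an independent Gaussian error of per-coefficient variance `sigma_e2`, with the estimate having variance `1 - sigma_e2`. Naive ZF vectors are built by Gram-Schmidt, and all function names are hypothetical. Rates are in nats.

```python
import math
import random

def crandn(rng, var=1.0):
    """One CN(0, var) sample (scaling a standard draw keeps var = 0 valid)."""
    s = (var / 2.0) ** 0.5
    return complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) * s

def inner(a, b):
    """Hermitian inner product a^H b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def unit(a):
    n = abs(inner(a, a)) ** 0.5
    return [x / n for x in a]

def project_out(vec, basis):
    """Remove from vec its components along the orthonormal vectors in basis."""
    out = list(vec)
    for q in basis:
        c = inner(q, out)
        out = [x - c * y for x, y in zip(out, q)]
    return out

def naive_zf(H_hat):
    """Unit vectors v_k orthogonal to every estimated channel h_hat_j, j != k."""
    V = []
    for k, hk in enumerate(H_hat):
        basis = []
        for j, hj in enumerate(H_hat):
            if j != k:
                basis.append(unit(project_out(hj, basis)))
        V.append(unit(project_out(hk, basis)))
    return V

def genie_upper_bound(M, snr, sigma_e2, frames=1000, seed=0):
    """Monte Carlo average of the genie-aided bound for UT 0, in nats.
    snr = P/N0; sigma_e2 is the assumed per-coefficient CSIT error variance."""
    rng = random.Random(seed)
    rho = snr / M  # per-user SNR P/(N0 M)
    acc = 0.0
    for _ in range(frames):
        H_hat = [[crandn(rng, 1.0 - sigma_e2) for _ in range(M)] for _ in range(M)]
        h0 = [x + crandn(rng, sigma_e2) for x in H_hat[0]]  # true channel of UT 0
        V = naive_zf(H_hat)
        gains = [abs(inner(h0, v)) ** 2 for v in V]  # |a_{0,j}|^2, j = 0..M-1
        acc += math.log(1.0 + rho * gains[0] / (1.0 + rho * sum(gains[1:])))
    return acc / frames

print(genie_upper_bound(M=4, snr=100.0, sigma_e2=0.0))
print(genie_upper_bound(M=4, snr=100.0, sigma_e2=0.05))
```

With `sigma_e2 = 0` the interference terms vanish and the estimate reduces to the ideal ZF rate; a nonzero error variance shows the loss caused by residual interference.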



^4 It is usually difficult, if not impossible, to obtain in closed form the joint distribution of the coefficients $\mathcal{A}_k$.
IV. Channel state feedback over an AWGN Channel

In this section we quantify the rate gap upper bound for different feedback strategies under the assumption that
the feedback channel is an unfaded AWGN channel with the same SNR as the downlink, i.e., $P/N_0$, and that
the UTs access the channel orthogonally. Each UT uses $\beta_{\rm fb} M$ feedback channel symbols, and therefore the total
number of feedback channel uses is $\beta_{\rm fb} M^2$.



A. Analog feedback 

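Analog feedback refers to transmission (on the feedback link) of the estimated downlink channel coefficients
by each UT using unquantized quadrature-amplitude modulation [28], [32], [50], [51]. More specifically, each UT
transmits on the feedback channel a scaled version of its common downlink training observation $\mathbf{s}_k$ defined in (4).
The resulting feedback channel output (BS observation) relative to UT $k$ is given by:

\begin{align}
\mathbf{g}_k &= \frac{\sqrt{\beta_{\rm fb} P}}{\sqrt{\beta_1 P + N_0}}\, \mathbf{s}_k + \tilde{\mathbf{w}}_k \notag \\
&= \frac{\sqrt{\beta_{\rm fb}\beta_1}\, P}{\sqrt{\beta_1 P + N_0}}\, \mathbf{h}_k + \frac{\sqrt{\beta_{\rm fb} P}}{\sqrt{\beta_1 P + N_0}}\, \mathbf{z}_k + \tilde{\mathbf{w}}_k \tag{23} \\
&= \frac{\sqrt{\beta_{\rm fb}\beta_1}\, P}{\sqrt{\beta_1 P + N_0}}\, \mathbf{h}_k + \mathbf{w}_k \tag{24}
\end{align}

where $\tilde{\mathbf{w}}_k$ represents the AWGN noise on the uplink feedback channel (variance $N_0$) and $\mathbf{z}_k$ is the noise during the
common training phase. The power scaling $\beta_{\rm fb}$ corresponds to the number of channel uses per channel coefficient
(we require $\beta_{\rm fb} \ge 1$ so that each coefficient is transmitted at least once), assuming that transmission in the feedback
channel has per-symbol power $P$ (averaged over frames) and that the channel state vector is modulated by a
$\beta_{\rm fb} M \times M$ unitary spreading matrix [28]. Because $\mathbf{z}_k$ and $\tilde{\mathbf{w}}_k$ are each complex Gaussian with covariance $N_0 \mathbf{I}$
and are independent, $\mathbf{w}_k$ is complex Gaussian with covariance $\sigma_w^2 \mathbf{I}$ with:

$$ \sigma_w^2 = N_0 \left( 1 + \frac{\beta_{\rm fb} P/N_0}{1 + \beta_1 P/N_0} \right) \tag{25} $$

The BS computes the MMSE estimate of the channel vector based on $\mathbf{g}_k$ as:

$$ \hat{\mathbf{h}}_k = \frac{\sqrt{\beta_{\rm fb}\beta_1}\, P}{\sqrt{\beta_1 P + N_0}\, (\beta_{\rm fb} P + N_0)}\, \mathbf{g}_k \tag{26} $$

Using (24), the channel can be written in terms of the BS estimate and estimation error as:

$$ \mathbf{h}_k = \hat{\mathbf{h}}_k + \mathbf{e}_k \tag{27} $$

where $\mathbf{e}_k$ is independent of the estimate and is Gaussian with covariance $\sigma_e^2 \mathbf{I}$ with

$$ \sigma_e^2 = \frac{1}{1 + \beta_1 P/N_0} + \frac{\beta_1 P/N_0}{(1 + \beta_1 P/N_0)(1 + \beta_{\rm fb} P/N_0)} \tag{28} $$

This characterization of $(\mathbf{h}_k, \hat{\mathbf{h}}_k)$ can be used to derive the rate gap upper bound for analog feedback: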

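Theorem 4: If each UT feeds back its channel coefficients in analog fashion over $\beta_{\rm fb} M$ channel uses of an
AWGN uplink channel with SNR $\frac{P}{N_0}$, the rate gap upper bound is given by ("AF" standing for Analog Feedback)

$$ \overline{\Delta R}^{\rm AF} = \log\left( 1 + \frac{P/N_0}{M} \frac{1}{1 + \beta_2 P/N_0} + \frac{M-1}{M} \frac{P}{N_0} \left[ \frac{1}{1 + \beta_1 P/N_0} + \frac{\beta_1 P/N_0}{(1 + \beta_1 P/N_0)(1 + \beta_{\rm fb} P/N_0)} \right] \right) \tag{29} $$

Proof: See Appendix III. □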

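It is straightforward to see that $\overline{\Delta R}^{\rm AF}$ can be upper bounded as

$$ \overline{\Delta R}^{\rm AF} \le \log\left( 1 + \frac{1}{M \beta_2} + \frac{M-1}{M} \left( \frac{1}{\beta_1} + \frac{1}{\beta_{\rm fb}} \right) \right) \tag{30} $$

Hence, the rate gap is uniformly bounded for all SNRs and therefore the multiplexing gain is preserved (i.e.,
$\lim_{P \to \infty} \frac{R_k}{\log P} = 1$) in spite of the imperfect CSI.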
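A short numerical sketch can make the uniform boundedness concrete. It assumes that the interference term in (18) equals the CSIT error variance implied by the MMSE model of (24)-(27), namely $\sigma_e^2 = \frac{1}{1+\beta_1\rho} + \frac{\beta_1\rho}{(1+\beta_1\rho)(1+\beta_{\rm fb}\rho)}$ with $\rho = P/N_0$, and that the dedicated training error variance is $1/(1+\beta_2\rho)$. The function names are illustrative and rates are in nats.

```python
import math

def sigma2_sq(beta2, snr):
    """Dedicated-training error variance, 1 / (1 + beta2 * P/N0)."""
    return 1.0 / (1.0 + beta2 * snr)

def sigma_e_sq(beta1, beta_fb, snr):
    """Assumed CSIT error variance after common training plus analog feedback."""
    a = 1.0 + beta1 * snr
    b = 1.0 + beta_fb * snr
    return 1.0 / a + (beta1 * snr) / (a * b)

def rate_gap_af(M, beta1, beta2, beta_fb, snr):
    """Rate gap upper bound for analog feedback at SNR = P/N0."""
    self_noise = sigma2_sq(beta2, snr) * snr / M
    interference = (M - 1) / M * snr * sigma_e_sq(beta1, beta_fb, snr)
    return math.log(1.0 + self_noise + interference)

def rate_gap_af_limit(M, beta1, beta2, beta_fb):
    """SNR-independent bound: log(1 + 1/(M b2) + (M-1)/M (1/b1 + 1/b_fb))."""
    return math.log(1.0 + 1.0 / (M * beta2)
                    + (M - 1) / M * (1.0 / beta1 + 1.0 / beta_fb))

M, beta1, beta2, beta_fb = 4, 2.0, 1.0, 2.0
for snr_db in (0, 10, 20, 30, 40):
    snr = 10.0 ** (snr_db / 10.0)
    print(snr_db, rate_gap_af(M, beta1, beta2, beta_fb, snr))
print("limit", rate_gap_af_limit(M, beta1, beta2, beta_fb))
```

The gap grows with SNR but saturates at the SNR-independent limit, which is why a fixed amount of training and feedback suffices to preserve the multiplexing gain.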
An intuitive understanding of this rate loss is obtained if one re-examines the UT received signal in the form
used in Theorem 1:

$$ y_k = \hat{a}_{k,k} u_k + \underbrace{f_k u_k}_{\text{Self-noise}} + \underbrace{\sum_{j \neq k} (\mathbf{h}_k^{H}\hat{\mathbf{v}}_j) u_j}_{\text{Interference}} + \underbrace{z_k}_{\text{Noise}} \tag{31} $$

The imperfect channel state information (at the UT and BS) effectively increases the noise from the thermal noise
level $N_0$ to the sum of the thermal noise, self-noise, and interference power, and the rate gap upper bound $\overline{\Delta R}^{\rm AF}$
is precisely the logarithm of the ratio of the effective noise to the thermal noise power.

Remark 4.1: In many systems, the uplink SNR is smaller than the downlink SNR because UTs transmit with
reduced power. If the uplink SNR is $\Gamma P/N_0$ rather than $P/N_0$, $\overline{\Delta R}^{\rm AF}$ is equal to the expression in Theorem 4 with $\beta_{\rm fb}$
replaced with $\Gamma \beta_{\rm fb}$. This does not change the multiplexing gain, but can have a significant effect on the rate gap.

Remark 4.2: It is easy to see that a TDD system with perfectly reciprocal uplink-downlink channels where
each UT transmits $\beta_{\rm tdd}$ pilots (a single pilot trains all $M$ BS antennas) in an orthogonal manner corresponds
exactly to an FDD system with perfect feedback ($\beta_{\rm fb} \to \infty$) and $\beta_1 = \beta_{\rm tdd}$, because the downlink training in an
FDD system is equivalent to the uplink training in a TDD system. Therefore, as a byproduct of our analysis, we
obtain a result for TDD open loop CSIT estimation:

\begin{align}
\overline{\Delta R}^{\rm TDD} &= \log\left( 1 + \frac{P/N_0}{M} \frac{1}{1 + \beta_2 P/N_0} + \frac{M-1}{M} \frac{P/N_0}{1 + \beta_{\rm tdd} P/N_0} \right) \tag{32} \\
&\le \log\left( 1 + \frac{1}{M \beta_2} + \frac{M-1}{M} \frac{1}{\beta_{\rm tdd}} \right). \tag{33}
\end{align}

Dedicated training is necessary even in TDD systems because UTs do not know the channels of other UTs and
thus are not aware of the beamforming vectors used by the BS. Finally, note that in TDD a total of $M \beta_{\rm tdd}$ uplink
training symbols and $M \beta_2$ downlink (dedicated) training symbols are needed.



B. Digital feedback 

We now consider "digital" feedback, where the estimated channel vector $\tilde{\mathbf{h}}_k$ is quantized at each UT and represented
by $B$ bits. The packet of $B$ bits is fed back by each UT to the BS. We begin by computing the rate gap upper
bound in terms of bits, and later in the section relate this to feedback channel uses.

Following [21], [20], [19], [26], we consider a specific scheme for channel state quantization based on a
quantization codebook $\mathcal{C} = \{\mathbf{p}_1, \ldots, \mathbf{p}_{2^B}\}$ of unit-norm vectors in $\mathbb{C}^{M}$. The quantization $\hat{\mathbf{h}}_k$ of the estimated
channel vector $\tilde{\mathbf{h}}_k$ is found according to the decision rule:

$$ \hat{\mathbf{h}}_k = \arg\max_{\mathbf{p} \in \mathcal{C}} |\tilde{\mathbf{h}}_k^{H}\mathbf{p}| \tag{34} $$

and thus $\hat{\mathbf{h}}_k$ is the quantization vector forming the minimum angle with $\tilde{\mathbf{h}}_k$. The corresponding $B$-bit quantization
index is fed back to the BS. Because $\hat{\mathbf{h}}_k$ is unit-norm, no channel magnitude information is conveyed.

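In [24], [26] it is shown that for a random ensemble of quantization codebooks referred to as Random Vector
Quantization (RVQ), obtained by generating $2^B$ quantization vectors independently and uniformly distributed on
the unit sphere in $\mathbb{C}^{M}$ (see [26] and references therein), the average (angular) distortion is given by:

$$ E\left[ \sin^2\left( \angle(\tilde{\mathbf{h}}_k, \hat{\mathbf{h}}_k) \right) \right] = 2^{B} \beta\left( 2^{B}, \frac{M}{M-1} \right) \le 2^{-\frac{B}{M-1}} \tag{35} $$

where $\beta(\cdot)$ is the beta function and $\sin^2\left(\angle(\tilde{\mathbf{h}}_k, \hat{\mathbf{h}}_k)\right) = 1 - \frac{|\tilde{\mathbf{h}}_k^{H}\hat{\mathbf{h}}_k|^2}{\|\tilde{\mathbf{h}}_k\|^2}$. We assume each UT uses an independently
generated codebook. For this particular quantization scheme, we can compute the rate gap upper bound: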

Theorem 5: If each UT quantizes its channel to B bits (using RVQ) and conveys these bits in an error-free 
fashion to the BS, the rate gap upper bound is given by ("DF" standing for Digital Feedback): 



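$$\overline{\Delta R}^{DF} = \log\left(1 + \frac{P/N_0}{1 + \beta_2 P/N_0}\,\frac{1}{M} + \frac{P/N_0}{1 + \beta_1 P/N_0}\,\frac{M-1}{M} + \frac{\beta_1 (P/N_0)^2}{1 + \beta_1 P/N_0}\, 2^B \beta\left(2^B, \frac{M}{M-1}\right)\right) \qquad (36)$$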
Proof: See Appendix IV. □
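The RVQ distortion expression in (35) can be checked numerically. The following is a minimal sketch, not part of the original analysis: it assumes i.i.d. complex Gaussian channel estimates, draws a fresh random codebook for every trial (so that the average is over the RVQ ensemble), and compares the Monte Carlo distortion with $2^B \beta(2^B, M/(M-1))$ and with the bound $2^{-B/(M-1)}$. All function names are hypothetical.

```python
import math
import random


def random_unit_vector(m, rng):
    """Draw a vector uniformly on the unit sphere in C^m (normalized complex Gaussian)."""
    v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(m)]
    norm = math.sqrt(sum(abs(x) ** 2 for x in v))
    return [x / norm for x in v]


def inner_abs(h, p):
    """Return |h^H p| for two complex vectors given as lists."""
    return abs(sum(hi.conjugate() * pi for hi, pi in zip(h, p)))


def rvq_quantize(h, codebook):
    """Pick the codeword maximizing |h^H p|, as in the decision rule (34)."""
    return max(codebook, key=lambda p: inner_abs(h, p))


def sin2_angle(h, p):
    """sin^2 of the angle between h and the unit-norm vector p."""
    return 1.0 - inner_abs(h, p) ** 2 / sum(abs(x) ** 2 for x in h)


def rvq_distortion_exact(m, bits):
    """Evaluate 2^B * beta(2^B, M/(M-1)) through log-gamma for numerical stability."""
    n, b = 2.0 ** bits, m / (m - 1.0)
    log_beta = math.lgamma(n) + math.lgamma(b) - math.lgamma(n + b)
    return n * math.exp(log_beta)


def rvq_distortion_mc(m, bits, trials, seed=0):
    """Monte Carlo average of sin^2 over channels and random codebooks."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        h = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(m)]
        codebook = [random_unit_vector(m, rng) for _ in range(2 ** bits)]
        total += sin2_angle(h, rvq_quantize(h, codebook))
    return total / trials


if __name__ == "__main__":
    M, B = 4, 6
    print("exact      :", rvq_distortion_exact(M, B))
    print("monte carlo:", rvq_distortion_mc(M, B, trials=1000))
    print("upper bound:", 2 ** (-B / (M - 1)))
```

The exhaustive search over $2^B$ codewords is only practical for small $B$; it is used here purely to illustrate the decision rule.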



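Using (35), the rate gap upper bound is further upper bounded as:

$$\overline{\Delta R}^{DF} < \log\left(1 + \frac{1}{M\beta_2} + \frac{M-1}{M}\,\frac{1}{\beta_1} + \frac{P}{N_0}\, 2^{-\frac{B}{M-1}}\right). \qquad (37)$$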



Comparing this to the rate gap in the analog feedback case (30), we notice that the dependence on $\beta_1$ and $\beta_2$ is
precisely the same for both analog and digital feedback.

The next step is translating the rate gap upper bound so that it is in terms of feedback symbols rather than bits.
For the time being, we shall make the very unrealistic assumption that the feedback link can operate error-free at
capacity, i.e., it can reliably transmit $\log_2(1 + P/N_0)$ bits per symbol.

The analog feedback considered before provides a noisy version of the channel vector norm in addition to its
direction. Although this information is irrelevant for the ZF beamforming considered here, it might be useful in
some user selection algorithms such as those proposed in [41], [32], [15], [42]. In contrast, digital feedback based
on unit-norm quantization vectors provides no norm information. Thus, for fair comparison, we assume that $\beta_{fb}M$
feedback symbols in the analog feedback scheme correspond to $\beta_{fb}(M-1)$ feedback symbols for the digital
feedback scheme; i.e., a system using digital feedback could use one feedback symbol to transmit channel norm
information. An alternative justification for this is to notice that the analog feedback system could be modified
to operate in $\beta_{fb}(M-1)$ channel symbols by transmitting only the $M-1$ relative phases and amplitudes of the
channel coefficients, since the absolute norm and phase are irrelevant to the ZF beamforming considered here.

Under this assumption, the number of feedback bits per mobile is $B = \beta_{fb}(M-1)\log_2(1 + P/N_0)$. Plugging
this into (37) gives:



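$$\overline{\Delta R}^{DF} < \log\left(1 + \frac{1}{M\beta_2} + \frac{M-1}{M}\,\frac{1}{\beta_1} + \frac{P/N_0}{(1 + P/N_0)^{\beta_{fb}}}\right). \qquad (38)$$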



Similar to analog feedback, if $\beta_{fb} \geq 1$ then the rate gap is upper bounded and full multiplexing gain is preserved.
However, it should be noticed that for $\beta_{fb}$ strictly larger than 1 digital feedback yields a term $\frac{P/N_0}{(1 + P/N_0)^{\beta_{fb}}}$ that
vanishes as $P/N_0 \to \infty$. This should be contrasted with the constant term for the case of analog feedback.
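The behavior of the digital feedback bound can be illustrated numerically. The sketch below is a hedged illustration rather than part of the original analysis: it evaluates the bound $\log_2\left(1 + \frac{1}{M\beta_2} + \frac{M-1}{M}\frac{1}{\beta_1} + \frac{P}{N_0}2^{-B/(M-1)}\right)$ with $B = \beta_{fb}(M-1)\log_2(1+P/N_0)$, assuming that rates are measured in bits (base-2 logarithm). The function names and the use of `math.inf` to represent perfect training are our own choices.

```python
import math


def df_rate_gap_bits(snr, m, beta1, beta2, bits):
    """Digital-feedback rate gap bound for B = bits feedback bits per UT.

    snr is P/N0 in linear scale; beta1, beta2 may be math.inf (perfect training).
    """
    quantization_term = snr * 2.0 ** (-bits / (m - 1))
    return math.log2(1 + 1 / (m * beta2) + (m - 1) / (m * beta1) + quantization_term)


def df_rate_gap_capacity_feedback(snr, m, beta1, beta2, beta_fb):
    """Same bound when the feedback link carries log2(1 + snr) bits per symbol."""
    bits = beta_fb * (m - 1) * math.log2(1 + snr)
    return df_rate_gap_bits(snr, m, beta1, beta2, bits)


if __name__ == "__main__":
    for snr_db in (0, 10, 20, 30):
        snr = 10 ** (snr_db / 10)
        gaps = [df_rate_gap_capacity_feedback(snr, 4, math.inf, math.inf, b)
                for b in (1, 2)]
        print(f"{snr_db:2d} dB: beta_fb=1 -> {gaps[0]:.4f}, beta_fb=2 -> {gaps[1]:.4f}")
```

With perfect training, the bound approaches 1 bit for $\beta_{fb} = 1$ and decays to zero for $\beta_{fb} = 2$ as the SNR grows.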



C. Effects of feedback errors 

We now remove the optimistic assumption that the digital feedback channel can operate error-free at capacity. In 
general, coding for the CSIT feedback channel should be regarded as a joint source-channel coding problem, made 
particularly interesting by the non-standard distortion measure and by the fact that a very short block length is 
required. A thorough discussion of this subject is out of the scope of the present paper and is the matter of current 
investigation (see for example [52], [53]). Here, we restrict ourselves to the detailed analysis of a particularly simple 
scheme based on uncoded QAM. Perhaps surprisingly, this scheme is sufficient to achieve a vanishing rate gap in 
the high SNR region, for an appropriate choice of the system parameters. 

In the proposed scheme, the UTs perform quantization using RVQ and transmit the feedback bits using plain 
uncoded QAM. The quantization bits are randomly mapped onto the QAM symbols (i.e., no intelligent bit-labeling 
or mapping is used). Therefore, even a single erroneous feedback bit from UT $k$ makes the BS's CSIT vector $\hat{\mathbf{h}}_k$
essentially useless. Also, no particular error detection strategy is used and thus the BS computes the beamforming 
matrix on the basis of the received feedback, although this may be in error. 

We again let $\beta_{fb}(M-1)$ denote the number of channel uses to transmit the feedback bits (per UT). Interestingly,
even for this very simple scheme there is a non-trivial tradeoff between quantization distortion and channel errors. In
order to maintain a bounded rate gap, the number of feedback bits must be scaled at least as $(M-1)\log_2(1 + P/N_0) \approx (M-1)\log_2(P/N_0)$.
Therefore, we consider sending $B = \alpha(M-1)\log_2(P/N_0)$ bits for $1 < \alpha < \beta_{fb}$, in $\beta_{fb}(M-1)$
channel uses, which corresponds to $(\alpha/\beta_{fb})\log_2(P/N_0)$ bits per QAM symbol.

^The assumption that the feedback link operates error-free at capacity is unrealistic in the context of this model because the feedback channel coding block length is very small and because
the need for very fast feedback (essentially delay-free) prevents grouping blocks of channel coefficients and using a larger coding block length.
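The symbol error rate for square QAM with $q$ constellation points is bounded by [54]:

$$P_s \leq 4\, Q\left(\sqrt{\frac{3P}{N_0 (q-1)}}\right) \qquad (39)$$

where $Q(x) = \frac{1}{\sqrt{2\pi}}\int_x^\infty e^{-t^2/2}\,dt$ is the Gaussian probability tail function. Using the fact that $q = (P/N_0)^{\alpha/\beta_{fb}}$, we
obtain the upper bound

$$P_s \leq 2 \exp\left(-\frac{3}{2}\left(\frac{P}{N_0}\right)^{1 - \alpha/\beta_{fb}}\right). \qquad (40)$$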

If $\alpha = \beta_{fb}$, which corresponds to signaling at capacity with uncoded modulation, $P_s$ does not decrease with SNR
and system performance is very poor. However, for $\alpha < \beta_{fb}$, which corresponds to transmitting at a fraction of
capacity, $P_s \to 0$ as $P/N_0 \to \infty$. The error probability of the entire feedback message (transmitted in $\beta_{fb}(M-1)$
QAM symbols) is given by

$$P_{e,fb} = 1 - (1 - P_s)^{\beta_{fb}(M-1)} \leq \beta_{fb}(M-1) P_s \qquad (41)$$

where the inequality follows from the union bound. Note the tradeoff between distortion and feedback error: a
large $\alpha$ yields finer quantization but larger $P_{e,fb}$, while a small $\alpha$ provides poorer quantization but smaller $P_{e,fb}$.
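The tradeoff between quantization resolution and feedback errors can be explored with a short numerical sketch. The code below is only an illustration under stated assumptions: it uses the standard union-type bound $P_s \leq 4Q\left(\sqrt{3\,\mathrm{SNR}/(q-1)}\right)$ for square QAM with $q = (P/N_0)^{\alpha/\beta_{fb}}$ points (treating $q$ as a real number rather than a valid constellation size), and evaluates both sides of the message error relation $1 - (1-P_s)^{\beta_{fb}(M-1)} \leq \beta_{fb}(M-1)P_s$. Function names are hypothetical.

```python
import math


def q_func(x):
    """Gaussian tail probability Q(x), expressed through the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def qam_symbol_error_bound(snr, alpha, beta_fb):
    """Bound on the QAM symbol error rate when q = snr**(alpha/beta_fb) (requires snr > 1)."""
    q = snr ** (alpha / beta_fb)
    if q <= 1.0:
        raise ValueError("need snr > 1 and alpha > 0 so that q > 1")
    return min(1.0, 4.0 * q_func(math.sqrt(3.0 * snr / (q - 1.0))))


def feedback_message_error(snr, alpha, beta_fb, m):
    """Return (message error probability, union bound) over beta_fb*(m-1) QAM symbols."""
    ps = qam_symbol_error_bound(snr, alpha, beta_fb)
    n_symbols = beta_fb * (m - 1)
    exact = 1.0 - (1.0 - ps) ** n_symbols
    union = min(1.0, n_symbols * ps)
    return exact, union


if __name__ == "__main__":
    for snr_db in (10, 20, 30):
        snr = 10 ** (snr_db / 10)
        for alpha in (1.2, 1.6, 2.0):
            exact, union = feedback_message_error(snr, alpha, beta_fb=2, m=4)
            print(f"{snr_db} dB, alpha={alpha}: P_e,fb={exact:.3e} (union bound {union:.3e})")
```

For $\alpha = \beta_{fb}$ the symbol error bound stays roughly constant with SNR, whereas for $\alpha < \beta_{fb}$ it decays rapidly, in line with the discussion above.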

Theorem 6: If each UT quantizes its estimated channel using $B = \alpha(M-1)\log_2(P/N_0)$ bits (using RVQ), and
transmits on the feedback link using $\beta_{fb}(M-1)$ channel uses with uncoded QAM modulation, the resulting rate
gap can be upper bounded by



:rDF-ERRORS 



1 




< log|l + ^ + (l-Pe,fb)l (^) +^r^ - I +^^'e,fb I , (42) 



where $P_{e,fb}$ is given by (40) and (41).



Proof: See Appendix V. □

If $1 < \alpha < \beta_{fb}$, then the effect of feedback vanishes as $P/N_0 \to \infty$, somewhat similar to the case of error-free
feedback. This is because the feedback error probability decays exponentially in $(P/N_0)^{1-\alpha/\beta_{fb}}$, so that the term
$\frac{P}{N_0}P_{e,fb}$ vanishes as $P/N_0 \to \infty$ for all $\alpha < \beta_{fb}$, while obviously $(P/N_0)^{1-\alpha}$ vanishes for all $\alpha > 1$.

A number of simple improvements are possible. For example, each UT may estimate its interference coefficients
$\{a_{k,j} : j \neq k\}$ from the dedicated training phase, and decide whether its feedback message was received correctly or
in error by setting a threshold on the interference power: if the interference power is $(M-1)P$,
then it is likely that a feedback error occurred. If, on the contrary, it is $2^{-B/(M-1)}P$, then it is likely that the
feedback message was correctly received. Interestingly, for $B = \alpha(M-1)\log_2(P/N_0)$ with $\alpha > 1$, detecting feedback
error events becomes easier and easier as $P/N_0$ increases and/or as the number of antennas $M$ increases. In brief,
for a large number of antennas any terminal whose feedback message was received in error is completely drowned
in interference and should be able to detect this event with high probability. Assuming that the UTs can perfectly
detect their own feedback error events as described above, then they can simply discard the frames corresponding
to feedback errors. The resulting achievable rate in this case is lower bounded by



oDF-Errors-Detect \ /I D \ 



, / 1 M- 1 1 / P 



(43) 



in light of (38) (after replacing $\beta_{fb}$ with $\alpha$) and of Corollary 3.1. Note that this rate lies between the achievable
rate lower bound obtained via the rate gap in (42) and the genie-aided upper bound from Theorem 3.



Remark 4.3: It is interesting to notice that feedback errors make the residual interference behave as an
impulsive noise: it has very large variance with small probability $P_{e,fb}$. It is therefore clear that detecting the
feedback errors and discarding the corresponding frames yields significant improvements. Using this knowledge at
the receiver (as in the rate bound (43)) avoids the large "Jensen's penalty" incurred by the rate gap in (42), where
the expectation with respect to the feedback error events is taken inside the logarithm.

Remark 4.4: We notice here that the naive ZF strategy examined in this paper is robust to feedback errors
in the following sense: the residual interference experienced by a given UT depends only on that particular UT's
feedback error probability. Therefore, a small number of users with poor feedback channel quality (very high
feedback error probability) does not destroy the overall system performance. This observation goes against the
conventional wisdom that feedback errors are "catastrophic".



D. Comparison between analog and digital channel feedback 

Based upon the bounds developed in the previous subsections as well as the genie-aided upper bounds (computed 
using Monte Carlo simulation) we can now compare analog, error-free digital, and QAM-based digital feedback. 
Because the effect of downlink and common training is effectively the same for all feedback strategies, we pursue 
this comparison under the assumption of perfect CSIR, i.e., perfect common and dedicated training corresponding
to $\beta_1 = \beta_2 \to \infty$. From (30) and (38) we have



Aii^liR < log(l + -^) (44) 



$$\overline{\Delta R}^{DF}_{CSIR} < \log\left(1 + \frac{P/N_0}{(1 + P/N_0)^{\beta_{fb}}}\right). \qquad (45)$$



If $\beta_{fb} = 1$ then analog and error-free digital feedback both achieve essentially the same rate gap of 1 bit per channel
use (per UT). However, if $\beta_{fb} > 1$, the rate gap for quantized feedback vanishes for $P/N_0 \to \infty$. This conclusion
finds an appealing interpretation in the context of rate-distortion theory. It is well-known (see for example [55] and
references therein) that "analog transmission" (the source signal is input directly to the channel after suitable power
scaling) is an optimal strategy to send an i.i.d. Gaussian source over an AWGN channel with the same bandwidth
under quadratic distortion. In our case, the source vector is $\mathbf{h}_k$ (Gaussian and i.i.d.) and the feedback channel is
AWGN with SNR $P/N_0$. Hence, the fact that analog feedback cannot be essentially outperformed for $\beta_{fb} = 1$
is expected. However, it is also well-known that if the channel bandwidth is larger than the source bandwidth
(which corresponds to the case where a block of $M$ source coefficients is transmitted over $\beta_{fb}M$ channel uses
with $\beta_{fb} > 1$), then analog transmission is strictly suboptimal with respect to a digital scheme operating at the
rate-distortion bound, because the distortion with analog transmission is $O((P/N_0)^{-1})$ whereas it is $O((P/N_0)^{-\beta_{fb}})$
for digital transmission.

This conclusion is confirmed by the numerical results shown in Figures 2 and 3. In Figure 2 the lower and
genie-aided upper bounds are plotted for analog feedback, digital feedback without error, and digital feedback with
error (QAM) versus SNR for an $M = 4$ system with $\beta_{fb} = 1$. For digital feedback with error, the error detection
bound in (43) is also included. The analog and error-free digital feedback schemes perform virtually identically
and achieve a rate approximately 3 dB away from the perfect channel state information benchmark. Note also
that the gap between the upper and lower bounds is not very large. For digital feedback with uncoded QAM,
however, there is a substantial gap between the upper and lower bounds; this gap and the performance with error
detection are explained by Remark 4.3. In Figure 3 only the genie-aided upper bounds are plotted (because the
lower and upper bounds are nearly identical and thus are difficult to distinguish) for the same setting with $\beta_{fb} = 2$. We
see that digital feedback with uncoded QAM outperforms analog feedback above approximately 5 dB, and that
the rate with digital feedback (with or without errors) converges to the ideal rate as predicted earlier. This figure
confirms that the effect of feedback vanishes when digital feedback is used, with or without errors, and $\beta_{fb} > 1$.
Finally, in Figure 4 the bounds are plotted as a function of $\beta_{fb}$ for fixed SNR of 10 dB and 20 dB. When
$\beta_{fb} \approx 1$ analog and error-free digital feedback are nearly equivalent, but as $\beta_{fb}$ is increased the rate with error-free
digital quickly approaches the perfect channel state information rate. When feedback errors are introduced, digital
feedback does eventually outperform analog and also approaches the ideal rate, but a larger $\beta_{fb}$ is required. It is
also worth noticing that as the SNR is increased, the value of $\beta_{fb}$ at which digital (with or without errors) begins to
outperform analog decreases toward 1: this is to be expected based upon the fact that the effect of feedback vanishes
as $P/N_0 \to \infty$ for any $\beta_{fb} > 1$ for digital, whereas it does not for analog feedback.

It is worth noting that the same basic conclusion, i.e., that digital feedback (with or without errors) outperforms
analog for sufficiently large $\beta_{fb}$, also holds in the presence of imperfect CSIR. However, because imperfect CSIR
leads to a residual term in the rate gap expression that does not vanish (even for large $P/N_0$), the absolute difference
between digital and analog feedback is reduced.

^The results for digital feedback with uncoded QAM are obtained by optimizing the value of $1 < \alpha < \beta_{fb}$ for each SNR. We refer to this as the "envelope"; that is, the plotted
curve is the pointwise maximum of the rate vs. SNR curves for all $\alpha$.

Fig. 3. Achievable rate upper bounds for analog, error-free digital, and QAM-based digital feedback for $M = 4$ and $\beta_{fb} = 2$ (horizontal axis: SNR [dB]).

V. Channel state feedback over the MIMO-MAC 

Orthogonal access in the feedback link requires $O(M^2)$ channel uses for the feedback, while the downlink
capacity scales at best as $O(M)$. When the number of antennas $M$ grows large, such a system would not scale well
with $M$. On the other hand, the inherent MIMO-MAC nature of the physical uplink channel suggests an alternative
approach, where multiple UTs simultaneously transmit on the MIMO uplink (feedback) channel and the spatial
dimension is exploited for channel state feedback too. This idea was considered for an FDD system in [28] and
analyzed in terms of the mean square error of the channel estimate provided to the BS.

As in [28], we partition the $M$ users into $M/L$ groups of size $L$, and let UTs belonging to the same group transmit
their feedback signal simultaneously, in the same time frame. Each UT transmits its $M$ channel coefficients over
$\beta_{fb}M$ channel uses, with $\beta_{fb} \geq 1$. Therefore, each group uses $\beta_{fb}M$ channel symbols and the total number of
channel uses spent in the feedback is $\beta_{fb}M^2/L$. Choosing $L \propto M$ (e.g., $L = M/2$) yields a total number of feedback
channel uses that grows linearly with $M$, such that the feedback resource converges to a fixed fraction of the
downlink capacity. We assume that the uplink feedback channel is affected by i.i.d. block fading (i.e., has the same
distribution as the downlink channel) and that there is no feedback delay.
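The scaling argument above amounts to simple bookkeeping, sketched below as an illustration. The helper name is hypothetical; it counts $\beta_{fb}M$ channel uses per group and $M/L$ groups, with $L = 1$ recovering orthogonal access.

```python
def feedback_channel_uses(m, beta_fb, group_size=1):
    """Total feedback channel uses: (M / L) groups, each using beta_fb * M channel uses.

    group_size = 1 corresponds to orthogonal (TDMA) access in the feedback link.
    """
    if group_size < 1 or m % group_size != 0:
        raise ValueError("group_size must be a positive divisor of m")
    return beta_fb * m * (m // group_size)


if __name__ == "__main__":
    for m in (4, 8, 16, 32):
        orthogonal = feedback_channel_uses(m, beta_fb=1)
        grouped = feedback_channel_uses(m, beta_fb=1, group_size=m // 2)
        print(f"M={m:2d}: orthogonal={orthogonal:4d}, L=M/2 -> {grouped:3d}")
```

With $L = M/2$ the total is $2\beta_{fb}M$, i.e., linear in $M$, whereas orthogonal access needs $\beta_{fb}M^2$ channel uses.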

With respect to the analysis provided in [28], the present work differs in a few important aspects: 1) we consider
both analog and digital feedback; 2) although our analog feedback model is essentially identical to the FDD scheme
of [28], we consider optimal MMSE estimation rather than Least-Squares estimation (zero-forcing pseudo-inverse);
3) we put our results in the context of the rate gap framework, which directly yields fundamental lower bounds on
achievable rates, rather than in terms of channel state estimation error.



A. Analog Feedback 

In an analog feedback scheme, each UT feeds back a scaled noisy version of its downlink channel, given
by $\mathbf{s}_k$, where $\mathbf{s}_k$ is the observation provided by the common training phase, defined in (4). Due to
the symmetry of the problem, we can focus on the simultaneous transmission of a single group of $L$ UTs. Let
$\mathbf{A} = [\mathbf{a}_1 \cdots \mathbf{a}_L] \in \mathbb{C}^{M \times L}$ denote the uplink fading matrix for this group of UTs (with i.i.d. entries $\sim \mathcal{CN}(0, 1)$),
and let, for $k = 1, \ldots, L$,

$$b_{k,j} = \frac{\sqrt{\beta_{fb} P}}{\sqrt{\beta_1 P + N_0}}\, s_{k,j} = \frac{\sqrt{\beta_{fb}\beta_1}\, P}{\sqrt{\beta_1 P + N_0}}\, h_{k,j} + \frac{\sqrt{\beta_{fb} P}}{\sqrt{\beta_1 P + N_0}}\, z_{k,j} \qquad (46)$$

denote the transmitted symbol by UT $k$ for its $j$-th channel coefficient, where $s_{k,j}$ is the $j$-th component of $\mathbf{s}_k$ and,
from (4), $z_{k,j}$ is the common training AWGN. For simplicity, we assume that the BS has perfect knowledge of the
uplink channel state $\mathbf{A}$; we later consider the more general case and see that the main conclusions are unchanged.

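The $M$-dimensional received vector $\mathbf{g}_j$, upon which the BS estimates the $j$-th antenna downlink channel coefficients
$h_{1,j}, \ldots, h_{L,j}$ of all users in the group, is given by:

$$\mathbf{g}_j = \sum_{i=1}^{L} \mathbf{a}_i b_{i,j} + \mathbf{w}_j = \mathbf{A}\mathbf{b}_j + \mathbf{w}_j \qquad (47)$$

where $\mathbf{b}_j = (b_{1,j}, \ldots, b_{L,j})^T$ and $\mathbf{w}_j$ is an AWGN vector with i.i.d. elements $\sim \mathcal{CN}(0, N_0)$. From the i.i.d. jointly Gaussian statistics of the
channel coefficients, downlink noise and uplink (feedback) noise it is immediate to obtain the MMSE estimator for
the downlink channel coefficient $h_{k,j}$, in the form

$$\hat{h}_{k,j} = c\, \mathbf{a}_k^H \left[\beta_{fb} P \mathbf{A}\mathbf{A}^H + N_0 \mathbf{I}\right]^{-1} \mathbf{g}_j \qquad (48)$$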
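where we define the constant $c = \frac{\sqrt{\beta_{fb}\beta_1}\, P}{\sqrt{\beta_1 P + N_0}}$. The corresponding MMSE, for given feedback channel matrix $\mathbf{A}$, is
given by

$$\sigma_k^2(\mathbf{A}) = 1 - c^2\, \mathbf{a}_k^H \left[\beta_{fb} P \mathbf{A}\mathbf{A}^H + N_0 \mathbf{I}\right]^{-1} \mathbf{a}_k. \qquad (49)$$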



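Theorem 7: If each UT feeds back its channel coefficients in analog fashion over the MIMO-MAC uplink channel,
with groups of $L$ users simultaneously feeding back and $\beta_{fb}M$ channel uses per group, the rate gap upper bound
is given by:

$$\overline{\Delta R}^{AF}_{\text{MIMO-MAC}} = \log\left(1 + \frac{P/N_0}{1 + \beta_2 P/N_0}\,\frac{1}{M} + \frac{M-1}{M}\,\frac{P}{N_0}\left[\frac{1}{1 + \beta_1 P/N_0} + \frac{\beta_1 P/N_0}{1 + \beta_1 P/N_0}\,\mathrm{mmse}\left(\frac{\beta_{fb} P}{N_0}\right)\right]\right) \qquad (50)$$

where we define the average channel state information estimation MMSE as

$$\mathrm{mmse}(\rho) = \frac{1}{L}\sum_{k=1}^{L}\mathbb{E}\left[\frac{1}{1 + \rho\lambda_k}\right] \qquad (51)$$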



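and where $\{\lambda_1, \ldots, \lambda_L\}$ denote the eigenvalues of the $L \times L$ central Wishart matrix $\mathbf{A}^H\mathbf{A}$.
Furthermore, if $L < M$ the rate gap is bounded and converges at high SNR to the constant

$$\lim_{P/N_0 \to \infty} \overline{\Delta R}^{AF}_{\text{MIMO-MAC}} = \log\left(1 + \frac{1}{\beta_2 M} + \frac{M-1}{M}\left(\frac{1}{\beta_1} + \frac{1}{\beta_{fb}(M-L)}\right)\right). \qquad (52)$$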



Proof: See Appendix VI □ 



Comparing this expression to the rate gap for analog feedback over an AWGN channel (30), we notice that an
SNR (array) gain of $M - L$ is achieved (on the feedback channel) when the feedback is performed over the MIMO-MAC
because the feedback (of $L$ users) is received over $M$ antennas. In addition, a factor of $L$ fewer feedback
symbols are required when the feedback is performed over the MIMO-MAC ($\beta_{fb}M^2/L$ vs. $\beta_{fb}M^2$). On the other
hand, using the second line of the RHS of (89) in Appendix III it is immediate to show that for $L = M$ the rate
gap upper bound grows unbounded as $\log\log(P/N_0)$.



From (52) we can optimize the value of $L$ (assuming $L < M$) for a fixed number of feedback channel uses,
which we denote by $\alpha M$ for some $\alpha \geq 2$ (if $L < M$ there must be at least two groups and thus we must have
at least $2M$ feedback symbols). By letting $\alpha M = \beta_{fb}M^2/L$, we obtain $\beta_{fb} = \alpha L/M$. Using this in (52), we have that
minimizing the rate gap bound is equivalent to maximizing the term $L(M-L)$ for fixed $M$ and $L < M$. Therefore,
the optimal group size is given by $L^* = M/2$. Substituting this value in (52) yields

$$\lim_{P/N_0 \to \infty} \overline{\Delta R}^{AF}_{\text{MIMO-MAC}} = \log\left(1 + \frac{1}{\beta_2 M} + \frac{2(M-1)}{\beta_{fb} M^2} + \frac{M-1}{M}\,\frac{1}{\beta_1}\right) \qquad (53)$$

and the corresponding total number of feedback symbols is $2\beta_{fb}M$. Interestingly, we notice that in the regime of
large $M$ the term that dominates the optimized rate gap bound (53) corresponds to the downlink common training
phase. In fact, the terms corresponding to dedicated training and feedback vanish as $M$ increases.
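The group-size optimization can be reproduced by brute force. The following sketch is an illustration under explicit assumptions: it takes the high-SNR rate gap to have the form $\log_2\left(1 + \frac{1}{\beta_2 M} + \frac{M-1}{M}\left(\frac{1}{\beta_1} + \frac{1}{\beta_{fb}(M-L)}\right)\right)$ in bits, fixes the total number of feedback channel uses to $\alpha M$ so that $\beta_{fb} = \alpha L/M$, and searches over the divisors $L < M$ of $M$. It does not enforce $\beta_{fb} \geq 1$ for the candidate group sizes; the function names are hypothetical.

```python
import math


def high_snr_gap_bits(m, group_size, beta1, beta2, beta_fb):
    """High-SNR rate gap constant (in bits) for analog feedback over the MIMO-MAC, L < M."""
    if not 0 < group_size < m:
        raise ValueError("the bounded-gap expression requires 0 < L < M")
    feedback_term = 1.0 / (beta_fb * (m - group_size))
    return math.log2(1 + 1 / (beta2 * m) + (m - 1) / m * (1 / beta1 + feedback_term))


def best_group_size(m, alpha, beta1, beta2):
    """Group size L < M (dividing M) that minimizes the gap for alpha*M total channel uses."""
    candidates = [l for l in range(1, m) if m % l == 0]
    return min(
        candidates,
        key=lambda l: high_snr_gap_bits(m, l, beta1, beta2, alpha * l / m),
    )


if __name__ == "__main__":
    for m in (4, 8, 16):
        l_star = best_group_size(m, alpha=2.0, beta1=1.0, beta2=1.0)
        gap = high_snr_gap_bits(m, l_star, 1.0, 1.0, 2.0 * l_star / m)
        print(f"M={m:2d}: L*={l_star}, high-SNR gap={gap:.3f} bits")
```

Since $\beta_{fb}(M-L) = \alpha L(M-L)/M$, the search simply maximizes $L(M-L)$ and returns $L^* = M/2$ for even $M$.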

When the total number of feedback symbols is larger than or equal to $2M$ (i.e., $\alpha \geq 2$), numerical results verify that also
at finite SNR the choice $L^* = M/2$ yields the best performance both in terms of the achievable rate lower bound and
of the genie-aided upper bound. Hence, the optimal MIMO-MAC feedback strategy is a combination of TDMA
and SDMA. In contrast, when the total number of feedback symbols is strictly smaller than $2M$ (i.e., $1 \leq \alpha < 2$),
choosing $L = M$ with $\beta_{fb} = \alpha$ is the only option. Although this choice yields an unbounded rate gap, it does
provide reasonable performance at finite SNRs.

A legitimate question at this point is the following: is the condition $L < M$ a fundamental limit of the MIMO-MAC
analog feedback in order to achieve a bounded rate gap, or is it due to the looseness of Theorem 2? In order
to address this question, we examine the genie-aided rate upper bound of Theorem 3 and obtain the following rate
upper bound:

^At high SNR the feedback from a particular UT is effectively received over an interference-free $1 \times (M-L+1)$ channel because $L-1$
interfering signals are nulled. However, this results in only an $M-L$ multiplicative gain because $\mathbb{E}[1/\chi^2_{2k}] = \frac{1}{k-1}$.
Theorem 8: When a group of $L = M$ UTs feed back the channel coefficients simultaneously over $\beta_{fb}M$ channel
uses of the fading MIMO-MAC, the difference between the ideal-CSIT ZF rate $R_k^{ZF}$ and the genie-aided upper bound of Theorem 3 is
uniformly bounded for all SNRs.

Proof: See Appendix VII. □

Theorem 8 suggests that if the UTs are able to obtain an estimate of their instantaneous residual interference
level in each frame, up to $M$ UTs can feed back their channel state information at the same time. The ability to
estimate the interference coefficients $\mathcal{A}_k$ (see (8) and the comment following Theorem 3) depends critically on
the quality of the dedicated training. Hence, the dedicated training has a direct impact on the design and efficiency
of the channel state feedback. Such inter-dependencies between the different system components can be illuminated
thanks to the comprehensive system analysis carried out in this work and are missed by making overly simplifying
assumptions (e.g., genie-aided coherent detection with perfect knowledge of the coefficients $\mathcal{A}_k$).



Remark 5.1: In [28], the same model in (46) for analog channel state feedback over the MIMO-MAC uplink
channel is considered. Instead of the linear MMSE estimator considered here, a zero-forcing approach (via the 
pseudo-inverse of the matrix A) is examined. In the case of L = M this yields an infinite error variance, which 
does not make sense in light of the fact that each channel coefficient has unity variance. This odd behavior can be 
avoided by performing an additional component-wise MMSE step. As a matter of fact, performance very similar 
to what we have found for the full MMSE estimator can be obtained for L < M by using a zero-forcing receiver 
for the channel state feedback, followed by individual (componentwise) MMSE scaling. We omit the analysis of 
such suboptimal scheme for the sake of brevity. 

Remark 5.2: It is also possible to analyze the more realistic scenario where the uplink channel matrix $\mathbf{A}$
is known imperfectly at the BS. We consider the following simple training-based scheme: the $L$ UTs within a
feedback group transmit a preamble of $\beta_{up}L$ training symbols, where $\beta_{up} \geq 1$ defines the uplink training length
(per UT). Without repeating all steps in detail, the uplink channel $\mathbf{A}$ admits the following decomposition:

$$\mathbf{A} = \hat{\mathbf{A}} + \tilde{\mathbf{A}} \qquad (54)$$

where the channel estimate and estimation error $(\hat{\mathbf{A}}, \tilde{\mathbf{A}})$ are jointly Gaussian and independent, with per-component
variances $1 - \sigma_{up}^2$ and $\sigma_{up}^2$, respectively, with $\sigma_{up}^2 = \frac{1}{1 + \beta_{up} P/N_0}$. Now, the MMSE estimation of the
downlink channel coefficients $h_{k,j}$ is conditional with respect to $\hat{\mathbf{A}}$. By repeating all previous steps, after a lengthy
calculation that we do not report here for the sake of brevity, we obtain the average estimation error in the form

TU-[ 2/AM , ^'^^ ( I^^PWo /3fb-P\ 

Ekfc(A)] = p + ^ ^"p mmse P 2tr p N ^^^^ 



where $\mathrm{mmse}(\cdot)$ was defined in (51). By comparing (55) with (87), we notice that they differ only in the argument
of the function $\mathrm{mmse}(\cdot)$. The two expressions coincide for $\beta_{up} \to \infty$, consistent with the fact that $\beta_{up} \to \infty$
corresponds to perfect estimation of the channel matrix $\mathbf{A}$. Furthermore, for large SNR, the two arguments differ
by a constant multiplicative factor. Hence, apart from this constant factor that depends on the uplink training
parameter $\beta_{up}$, the conclusions about the rate gap obtained for the case of perfect uplink channel knowledge also
hold for the case of training-based uplink channel estimation.

B. Digital Feedback 

In the case of digital feedback, we let $L \leq M$ UTs multiplex their channel state feedback codewords at the same
time. The resulting MIMO-MAC channel model is again given by (47), but now the vector $\mathbf{b}_j$ contains the $j$-th
symbols of the feedback codewords of the $L$ UTs sharing the same feedback frame. As in Section IV-B we assume
that feedback messages of $\alpha(M-1)\log_2(P/N_0)$ bits are sent in $\beta_{fb}(M-1)$ channel uses. Hence, the feedback symbols
transmitted by the $L$ UTs can be grouped in an $L \times \beta_{fb}(M-1)$ matrix, while the BS has an $M \times \beta_{fb}(M-1)$
observation upon which to estimate the transmitted symbols. We again assume each feedback symbol has average
energy $P$.

Suppose that the BS receiver operates optimally, by using a joint ML decoder for all the simultaneously
transmitting UTs. The high-SNR error probability performance of the MIMO-MAC channel was characterized
in terms of the diversity-multiplexing tradeoff in [56]. In particular, when each user transmits at rate $r\log_2(P/N_0)$
bits/symbol (i.e., with multiplexing gain $r$) over the MIMO-MAC with i.i.d. channel fading (as considered here),
the optimal ML decoder achieves an individual user average error probability

$$P_e \doteq (P/N_0)^{-d^*(r)}$$

where the "dot-equality" notation, introduced in [57], [56], indicates that $\lim_{P/N_0 \to \infty} -\frac{\log P_e}{\log(P/N_0)} = d^*(r)$. The error
probability SNR exponent $d^*(r)$ is referred to as the optimal diversity gain of the system. Particularizing the results
of [56] to the case of $L \leq M$ users with 1 antenna each, transmitting to a receiver with $M$ antennas, the optimal
diversity gain is given by

$$d^*(r) = \begin{cases} M(1-r) & \text{for } 0 \leq r \leq 1 \\ 0 & \text{otherwise.} \end{cases} \qquad (56)$$

This is the same exponent of a channel with a single user with a single antenna, transmitting to a receiver with $M$
antennas (single-input multiple-output with receiver antenna diversity). In other words, under our system parameters,
each UT achieves an error probability that decays with SNR as if TDMA on the feedback link was used (as if
the UT transmitted its feedback message alone on the MIMO uplink channel). From what is said above, it follows
that the multiplexing gain of all UTs is given by $r = \alpha/\beta_{fb}$. Furthermore, from the derivation of Section IV-C we
require that $1 < \alpha < \beta_{fb}$. It follows that the average feedback message error probability in the MIMO-MAC fading
channel is given by

$$P_{e,fb} = g(P/N_0)\,(P/N_0)^{-M(1 - \alpha/\beta_{fb})} \qquad (57)$$
where $g(x)$ is some sub-polynomial function, such that $\lim_{x \to \infty} x^{-\epsilon} g(x) = 0$ for all fixed $\epsilon > 0$.

If we examine the rate-gap expression with digital feedback (42), we see that in order to achieve a bounded
rate gap the error probability $P_{e,fb}$ must go to zero at least as fast as $(P/N_0)^{-1}$. From (57) we have that for all
$1 < \alpha < \beta_{fb}$ such that $M(1 - \alpha/\beta_{fb})$ is strictly larger than 1, the resulting rate gap is bounded and the effect
of feedback errors vanishes. This imposes the condition $\beta_{fb} > \frac{M}{M-1}$ and $\alpha < \frac{M-1}{M}\beta_{fb}$, which is stricter than the
condition $\beta_{fb} > 1$ and $\alpha < \beta_{fb}$ needed in the case of TDMA on an unfaded feedback channel previously analyzed in
Section IV-C.
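The conditions on $(\alpha, \beta_{fb})$ can be checked mechanically. The sketch below is illustrative only: it encodes the diversity gain $d^*(r) = M(1-r)$ for $0 \leq r \leq 1$ (zero otherwise), sets $r = \alpha/\beta_{fb}$, and tests whether the resulting error exponent is strictly larger than 1 while $1 < \alpha < \beta_{fb}$. The function names are hypothetical.

```python
def mac_diversity_gain(r, m):
    """Optimal diversity gain for single-antenna users and an M-antenna receiver."""
    return m * (1.0 - r) if 0.0 <= r <= 1.0 else 0.0


def bounded_gap_with_digital_mac_feedback(alpha, beta_fb, m):
    """True if 1 < alpha < beta_fb and the feedback error exponent exceeds 1."""
    if not 1.0 < alpha < beta_fb:
        return False
    return mac_diversity_gain(alpha / beta_fb, m) > 1.0


def feasible_alpha_range(beta_fb, m):
    """Open interval of alpha values giving a bounded gap, or None if it is empty."""
    upper = (m - 1) / m * beta_fb
    return (1.0, upper) if upper > 1.0 else None


if __name__ == "__main__":
    M = 4
    for beta_fb in (1.2, 1.5, 2.0, 8.0):
        print(f"beta_fb={beta_fb}: feasible alpha range = {feasible_alpha_range(beta_fb, M)}")
    print(bounded_gap_with_digital_mac_feedback(alpha=4.0, beta_fb=8.0, m=M))
```

For $M = 4$ the feasible range is empty unless $\beta_{fb} > 4/3$, in agreement with the condition $\beta_{fb} > M/(M-1)$.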

We conclude that a bounded rate gap can also be achieved with digital feedback on the MIMO-MAC uplink
channel. Therefore, also in this case we can achieve a number of feedback channel uses that scales linearly with
the number of BS antennas $M$. Explicit design of codes that achieve the optimal diversity-multiplexing tradeoff
of MIMO-MAC channels is not an easy task in general. In the particular case of $M$ users with one antenna each,
simple explicit constructions of MIMO-MAC codes for the digital channel state feedback are presented in [53]. These
codes can be optimally decoded by using a Sphere Decoder [58], [59] and achieve the performance promised by
the above analysis. It should be noticed, however, that while in the AWGN case the term $\frac{P}{N_0}P_{e,fb}$ in the rate gap
expression vanishes rapidly (faster than polynomially, in $P/N_0$), in the MIMO-MAC fading case it vanishes only
as $(P/N_0)^{1 - M(1 - \alpha/\beta_{fb})}$. Thus, for finite SNR the rate gap may be significantly larger than in the case of an unfaded
feedback channel and the optimal tradeoff between quantization distortion and the feedback error probability must
be sought by careful optimization of the parameters $\alpha$ and $\beta_{fb}$ (see details in [60]). Also, the same observations
about detecting feedback errors at the UTs and discarding the corresponding frames made at the end of Section
IV-C apply here.



C. Numerical example 

Fig. 5 shows both the genie-aided upper bound of Theorem 3 and the lower bound based on (50) of analog
feedback over a fading MIMO-MAC for $M = 4$ and $L = 2, 4$. We assume perfect CSIR. We notice that for
$L = 2$, the lower bound coincides with the genie-aided upper bound and comes very close to the performance of
ZF with ideal CSIT. For $L = M$, the rate gap of the lower bound (50) is unbounded but the double logarithmic
growth $\log\log(P/N_0)$ yields a very small gap for a wide range of practical SNRs. The genie-aided bound achieves
a constant rate gap even for $L = M$, in accordance with Theorem 8. Although not shown here, a system using
$M = 4$, $L = 2$ and $\beta_{fb} = 1$ does outperform $M = L = 4$, $\beta_{fb} = 2$ (both configurations use a total of 8 feedback
symbols per frame) in terms of the lower bound and the genie-aided upper bound throughout the SNR range shown;
this validates our earlier claim about the optimality of $L = M/2$ whenever at least $2M$ feedback symbols are used.

Fig. 5. Impact of $L$ with analog feedback over MIMO-MAC (panel title: analog feedback with $M = 4$; horizontal axis: SNR [dB]).

Fig. 6 compares the achievable rates of analog and digital feedback schemes based on the rate gap bounds (50) and (42),
over a fading MIMO-MAC for $M = 4$. For the digital feedback we assume that there exists some code achieving the
outage probability (57) with $g(P/N_0) = 1$. We compare both schemes for the same total number of feedback
symbols (24 symbols). For the analog feedback we choose $L = 2$, $\beta_{fb} = 3$, while for the digital feedback we let
$L = 4$, $\beta_{fb} = 8$, $\alpha = 4$. We observe that the digital feedback achieves near-optimal sum rate over the whole SNR range
while the analog feedback achieves a constant gap of roughly 0.7 bit/channel use. Surprisingly, the digital feedback
is able to let $M$ users transmit simultaneously while making both the quantization error and the feedback error vanish.

VI. Effects of CSIT feedback delay 

In this section we take into account the effect of feedback delay in a setting where the fading is temporally correlated. We assume that the fading is constant within each frame, but changes from frame to frame according to a stationary random process. In particular, assuming spatial independence, each entry of the channel vector $\mathbf{h}_k$ evolves independently according to the same complex circularly symmetric Gaussian stationary ergodic random process, denoted by $\{h(t)\}$, with mean zero, unit variance and power spectral density (Doppler spectrum) denoted by $S_h(\xi)$, $\xi \in [-1/2, 1/2]$, and satisfying $\int_{-1/2}^{1/2} S_h(\xi)\, d\xi = 1$. Notice that the discrete-time process $\{h(t)\}$ has time that ticks at the frame rate.

Because of symmetry and spatial independence, we can neglect the UT index $k$ and the antenna index and consider scalar rather than vector processes. Generalizing the earlier common-training observation model, the observations available at each UT at time $t - d$ from the common training phase take the form

$$\left\{ s(t - \tau) = \sqrt{\beta_1 P}\, h(t - \tau) + z(t - \tau) \; : \; \tau = d, d+1, d+2, \ldots, \infty \right\} \tag{58}$$

where $d$ indicates the feedback delay in frames. This means that the channel state feedback to be used by the BS at frame time $t$ is formed from noisy observations of the channel up to time $t - d$. We consider a scheme where each UT at frame $t - d$ produces the MMSE estimate of its channel at frame $t$ and sends this estimate (using either analog or digital feedback) to the BS; the BS uses the received feedback to choose the beamforming vectors used for data transmission in frame $t$.

A. Estimation Error at UT

The key quantity in the associated rate gap is the MMSE prediction error at the UT. Let $\hat{h}(t)$ denote the MMSE estimate of $h(t)$ given the observations in (58). Given the joint Gaussianity of $h$ and $s$, we can write

$$h(t) = \hat{h}(t) + n(t) \tag{59}$$

where $\mathbb{E}[|n(t)|^2] = \sigma_1^2$ is the estimation MMSE, and $\hat{h}(t)$ and $n(t)$ are independent with $\mathbb{E}[|\hat{h}(t)|^2] = 1 - \sigma_1^2$. From classical Wiener filtering theory [46], the one-step prediction ($d = 1$) MMSE is given by

$$\epsilon_1(\delta) = \exp\left( \int_{-1/2}^{1/2} \log\left( \delta + S_h(\xi) \right) d\xi \right) - \delta \tag{60}$$

where $\delta = N_0/(\beta_1 P)$ is the observation noise variance. The filtering MMSE ($d = 0$) is related to $\epsilon_1(\delta)$ as

$$\epsilon_0(\delta) = \frac{\delta\, \epsilon_1(\delta)}{\delta + \epsilon_1(\delta)} \tag{61}$$

The scenario considered in all previous sections corresponds to i.i.d. fading (across blocks) and $d = 0$, in which case $\epsilon_1(\delta) = 1$ (past observations are useless) and thus $\sigma_1^2 = \epsilon_0(\delta) = (1 + \beta_1 P/N_0)^{-1}$, which coincides with the block-by-block estimation MMSE of the previous sections.

More generally, in this section we shall consider $\sigma_1^2 = \epsilon_d(\delta)$ for $d = 0, 1$.
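As a minimal numerical sketch of the two error measures above, the following evaluates the prediction MMSE in (60) with a midpoint rule and then maps it to the filtering MMSE. The helper names (`prediction_mmse`, `filtering_mmse`, `flat_spectrum`) are hypothetical, and the filtering relation $1/\epsilon_0 = 1/\epsilon_1 + 1/\delta$ is assumed here as the standard link between one-step prediction and filtering for a scalar observation in white noise. The flat spectrum corresponds to i.i.d. fading.

```python
import math

def prediction_mmse(spectrum, delta, n=20000):
    """One-step prediction MMSE: exp(int log(delta + S_h(xi)) dxi) - delta.

    The integral over [-1/2, 1/2] is approximated with the midpoint rule.
    """
    step = 1.0 / n
    acc = 0.0
    for i in range(n):
        xi = -0.5 + (i + 0.5) * step
        acc += math.log(delta + spectrum(xi)) * step
    return math.exp(acc) - delta

def filtering_mmse(eps1, delta):
    """Filtering MMSE from the prediction MMSE (assumed relation
    1/eps0 = 1/eps1 + 1/delta)."""
    return delta * eps1 / (delta + eps1)

def flat_spectrum(xi):
    """Spectrum of i.i.d. (white) unit-variance fading."""
    return 1.0

delta = 0.1  # observation noise variance N0 / (beta1 * P), i.e. 10 dB for beta1 = 1
eps1 = prediction_mmse(flat_spectrum, delta)
eps0 = filtering_mmse(eps1, delta)
print("prediction MMSE:", eps1)
print("filtering MMSE: ", eps0)
```

For the flat spectrum the prediction error equals the channel variance (the past is useless), and the filtering error reduces to $(1 + 1/\delta)^{-1}$, matching the block-by-block case described in the text.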

We distinguish two cases of channel fading statistics, Doppler processes and regular processes:
• We say that $\{h(t)\}$ is a Doppler process if $S_h(\xi)$ is strictly band-limited to $[-F, F]$, where $F < 1/2$ is the maximum Doppler frequency shift, given by $F = \frac{v f_c}{c} T_f$, where $v$ is the mobile terminal speed (m/s), $f_c$ is the carrier frequency (Hz), $c$ is the speed of light (m/s) and $T_f$ is the frame duration (s) [40]. A Doppler process satisfies $\int_{-F}^{F} \log S_h(\xi)\, d\xi > -\infty$, and has prediction error†

$$\epsilon_1(\delta) = \delta^{1-2F} \exp\left( \int_{-F}^{F} \log\left( \delta + S_h(\xi) \right) d\xi \right) - \delta \tag{62}$$

Therefore, $\lim_{\delta \to 0} \epsilon_1(\delta) = 0$.
• We say that $\{h(t)\}$ is a regular process if $\epsilon_1(0) > 0$ (see [39] and references therein). In particular, a process satisfying the Paley-Wiener condition [46] $\int_{-1/2}^{1/2} \log S_h(\xi)\, d\xi > -\infty$ is regular.

* We focus on the case $d = 1$ because it is very relevant in practical applications. For example, high-data-rate downlink systems such as 1xEV-DO [61] already implement a very fast channel state feedback with at most one frame delay. Furthermore, the one-step prediction case allows an elegant closed-form analysis.

† As in [39], the same result holds for a wider class of processes such that the Lebesgue measure of the set $\{\xi \in [-1/2, 1/2] : S_h(\xi) = 0\}$ is equal to $1 - 2F$, and such that $\int_{\mathcal{D}} \log(S_h(\xi))\, d\xi > -\infty$, where $\mathcal{D}$ is the support of $S_h(\xi)$.

For the case of no delay ($d = 0$), for either type of process the estimation error goes to zero with the observation noise, i.e., $\epsilon_0(\delta) \to 0$ as $\delta \to 0$. However, the two types differ sharply in terms of prediction error: $\epsilon_1(\delta)$ is strictly positive for a regular process (even as $\delta \to 0$), whereas $\epsilon_1(\delta) \to 0$ for Doppler processes, as quantified in the following lemma.

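Lemma 1: The noisy prediction error of a Doppler process satisfies

$$\epsilon_1(\delta) = \kappa\, \delta^{1-2F} + O(\delta) \tag{63}$$

for $\delta \downarrow 0$, where $\kappa$ is a constant term independent of $\delta$.

Proof: Applying Jensen's inequality to (62), and using the fact that $\int S_h(\xi)\, d\xi = 1$, we obtain the upper bound

$$\epsilon_1(\delta) \le \delta^{1-2F} \left[ \frac{1}{2F} + \delta \right]^{2F} - \delta \tag{64}$$

Using the fact that $\log$ is increasing, we arrive at the lower bound

$$\epsilon_1(\delta) \ge \delta^{1-2F} \exp\left( \int_{-F}^{F} \log S_h(\xi)\, d\xi \right) - \delta \tag{65}$$

Combining these bounds we obtain the result. □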
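The two bounds used in the proof can be checked numerically. The sketch below is illustrative only: it assumes the Jakes spectrum $S_h(\xi) = 1/(\pi\sqrt{F^2 - \xi^2})$ on $(-F, F)$ as the Doppler process, uses the substitution $\xi = F \sin\theta$ to tame the endpoint singularity, and evaluates the Jensen upper bound $\delta^{1-2F}(1/(2F) + \delta)^{2F} - \delta$ and the lower bound $\delta^{1-2F}\exp(\int \log S_h) - \delta$. The function names are hypothetical.

```python
import math

def jakes_log_integrals(F, delta, n=20000):
    """Return (int log(delta + S_h) dxi, int log(S_h) dxi) over [-F, F].

    Uses xi = F*sin(theta), theta in (-pi/2, pi/2), with the midpoint rule.
    """
    step = math.pi / n
    with_noise = 0.0
    noiseless = 0.0
    for i in range(n):
        theta = -math.pi / 2 + (i + 0.5) * step
        c = math.cos(theta)
        s = 1.0 / (math.pi * F * c)   # Jakes spectrum at xi = F*sin(theta)
        w = F * c * step              # d(xi) = F*cos(theta) d(theta)
        with_noise += math.log(delta + s) * w
        noiseless += math.log(s) * w
    return with_noise, noiseless

def doppler_prediction_bounds(F, delta):
    """Return (lower bound, prediction MMSE, upper bound) for a Jakes process."""
    a, b = jakes_log_integrals(F, delta)
    scale = delta ** (1.0 - 2.0 * F)
    eps1 = scale * math.exp(a) - delta
    upper = scale * (1.0 / (2.0 * F) + delta) ** (2.0 * F) - delta
    lower = scale * math.exp(b) - delta
    return lower, eps1, upper

for delta in (1e-1, 1e-2, 1e-3):
    lower, eps1, upper = doppler_prediction_bounds(0.0185, delta)
    print(delta, lower, eps1, upper)
```

In this sketch all three quantities shrink roughly like $\delta^{1-2F}$ as $\delta$ decreases, which is the scaling that drives the multiplexing gain result for Doppler processes.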



B. Rate Gap Upper Bound 

When analog feedback is used, each UT transmits a scaled version of its MMSE estimate $\hat{h}(t)$ over the feedback channel. The only difference from the scenarios studied in Sections IV-A (AWGN feedback channel) and V-A (MIMO-MAC feedback channel) is that the estimation error at the UT is $\epsilon_d(N_0/(\beta_1 P))$ rather than $(1 + \beta_1 P/N_0)^{-1}$. As a result, a simple calculation confirms that the expressions for the rate gap upper bound given in Theorems 4 (AWGN) and 7 (MIMO-MAC) apply to the present case if $\epsilon_d(N_0/(\beta_1 P))$ is substituted for $(1 + \beta_1 P/N_0)^{-1}$. The same equivalence holds for digital feedback: each UT quantizes its MMSE estimate $\hat{h}(t)$, and as a result the rate gap upper bound given in Theorem 5 applies with the same substitution. For the sake of brevity, the expressions for the rate gap upper bound are not provided here.

In fact, the effect of feedback delay is most clearly illustrated by considering perfect feedback (i.e., $\beta_{\rm fb} \to \infty$), in which case (at frame $t$) the BS has perfect knowledge of $\hat{h}(t)$, the UT's prediction of $h(t)$ based on common training observations up to frame $t - d$. For the sake of simplicity we further assume perfect dedicated training (i.e., $\beta_2 \to \infty$), in which case the rate gap upper bound is

$$\Delta R \le \log\left( 1 + \frac{P}{N_0}\, \frac{M-1}{M}\, \epsilon_d\!\left( \frac{N_0}{\beta_1 P} \right) \right) \tag{66}$$
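To make the behavior of this bound concrete, the sketch below evaluates $\log(1 + (P/N_0)\,\frac{M-1}{M}\,\epsilon)$ for the block-by-block estimation error $\epsilon = (1 + \beta_1 P/N_0)^{-1}$ and compares it with the high-SNR limit $\log(1 + \frac{M-1}{M\beta_1})$. The function names are hypothetical, and base-2 logarithms are an assumption made here so that the gap reads in bits per channel use.

```python
import math

def rate_gap_upper_bound(snr, M, eps):
    """Rate gap bound log2(1 + snr * (M-1)/M * eps), with snr = P/N0 (linear)
    and eps the UT estimation/prediction MMSE."""
    return math.log2(1.0 + snr * (M - 1) / M * eps)

def block_estimation_mmse(snr, beta1):
    """MMSE of block-by-block estimation from beta1*M common training symbols."""
    return 1.0 / (1.0 + beta1 * snr)

M, beta1 = 4, 1.0
limit = math.log2(1.0 + (M - 1) / (M * beta1))  # high-SNR value of the bound
for snr_db in (0, 10, 20, 30, 40):
    snr = 10.0 ** (snr_db / 10.0)
    gap = rate_gap_upper_bound(snr, M, block_estimation_mmse(snr, beta1))
    print(snr_db, "dB:", round(gap, 4), "limit:", round(limit, 4))
```

Passing a different error function (for example a prediction error that does not vanish with the noise) into `rate_gap_upper_bound` shows directly how a non-vanishing error makes the bound grow with SNR.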

We now analyze the cases of no delay and one-step delay for both types of processes.

a) No feedback delay ($d = 0$): Because using past observations can only help, the filtering error is no larger than the error obtained if the past is ignored, i.e., $\epsilon_0(\delta) \le (1 + \beta_1 P/N_0)^{-1}$. It thus follows that for both Doppler and regular processes the rate gap is bounded. Based upon (61), Lemma 1 and the property $\epsilon_1(0) > 0$ for regular processes, it is straightforward to see that $(P/N_0)\, \epsilon_0(N_0/(\beta_1 P)) \to 1/\beta_1$ as $P/N_0 \to \infty$ for either regular or Doppler processes. As a result, the rate gap upper bound in (66) converges to $\log\left(1 + \frac{M-1}{M \beta_1}\right)$ at high SNR. This matches the high-SNR expression for block-by-block estimation in (30), showing that filtering does not provide a significant advantage at asymptotically high SNR. However, as later illustrated through numerical results, this convergence occurs extremely slowly for Doppler processes or highly correlated regular processes, in which case filtering does provide a non-negligible gain over a wide range of SNRs.

b) Feedback delay ($d = 1$): For a regular fading process, since $\epsilon_1(0) > 0$, the quantity $(P/N_0)\, \epsilon_1(N_0/(\beta_1 P))$ increases linearly with $P/N_0$, and thus the rate gap upper bound $\Delta R$ grows like $\log(P/N_0)$. As a result, the achievable rate lower bound $R_k^{\rm ZF} - \Delta R$ is bounded even as $P/N_0 \to \infty$. In addition, in Appendix VIII we show that the genie-aided upper bound is also bounded, due to the fundamentally non-deterministic nature of regular processes. This shows that with delayed feedback and a channel that evolves according to a regular fading process, a system that makes use of naive zero-forcing beamforming to $M$ users becomes interference limited. This behavior holds even with perfect CSIR (i.e., letting $\beta_1 \to \infty$).

Fortunately, physically meaningful fading processes belong to the class of Doppler processes, at least over a time span where they can be considered stationary. For a practical relative speed between BS and UT, such a time span is much larger than any reasonable coding block length. Hence, we may say that Doppler processes are more the rule than the exception. In this case, the system behavior is radically different. Using Lemma 1, at high SNR the rate gap upper bound is

$$\Delta R \le \log\left( 1 + \frac{M-1}{M}\, \kappa\, \beta_1^{-(1-2F)} \left( \frac{P}{N_0} \right)^{2F} + O(1) \right) \tag{67}$$

and thus the rate gap grows like $2F \log(P/N_0)$. Using this in the rate lower bound of Corollary 3.1, and considering the pre-log factor at high SNR, we have that the system sum rate is lower bounded by

$$\sum_{k=1}^{M} R_k \ge M(1 - 2F) \log\frac{P}{N_0} + O(1) \tag{68}$$

This shows that a multiplexing gain of $M(1 - 2F)$ is achievable.
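As a quick illustration of how small the multiplexing-gain loss is at pedestrian speeds, the following sketch evaluates the pre-log factor $M(1 - 2F)$ for the two normalized Doppler values used in the numerical examples of this section ($F = 0.0056$ and $F = 0.0185$). The function name is hypothetical.

```python
def multiplexing_gain(M, F):
    """Achievable pre-log factor M * (1 - 2F) for a Doppler process
    with normalized maximum Doppler shift F, 0 <= F < 1/2."""
    if not 0.0 <= F < 0.5:
        raise ValueError("F must lie in [0, 1/2)")
    return M * (1.0 - 2.0 * F)

M = 4
for F in (0.0056, 0.0185):
    gain = multiplexing_gain(M, F)
    print(f"F = {F}: gain = {gain:.4f} of {M} ({100 * gain / M:.2f}%)")
```

With these values more than 96% of the full multiplexing gain is retained, consistent with the "slight loss" visible in the one-step prediction results.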

Remark 6.1: If perfect CSIR is assumed, an interesting singularity is observed for Doppler processes. Under this assumption each UT is able to perform perfect prediction of its channel state on the basis of its past noiseless observations of the channel, by the definition of a Doppler process. Thus, it is as if there were no delay, and the full multiplexing gain of $M$ is achieved (even if the feedback link is imperfect). On the other hand, if perfect CSIR is not assumed and the UTs learn their channels through $\beta_1 M$ common training symbols, for any finite value of $\beta_1$ a multiplexing gain of only $M(1 - 2F)$ is achieved. This point illustrates, again, that neglecting some system aspects may lead to erroneous conclusions. In this case, by properly modeling imperfect CSIR we have illuminated the impact of the UTs' speed (which determines the channel Doppler bandwidth) on the system achievable rates in a concise and elegant way.

Remark 6.2: It is interesting to notice here the parallel with the results of [39] on the high-SNR capacity of the single-user scalar ergodic stationary fading channel with no CSIR and no CSIT, where it is shown that for a class of non-regular processes that includes the Doppler processes defined here, the high-SNR capacity grows like $\mathcal{L} \log(P/N_0)$, where $\mathcal{L}$ is the Lebesgue measure of the set $\{\xi \in [-1/2, 1/2] : S_h(\xi) = 0\}$. In our case, it is clear that $\mathcal{L} = 1 - 2F$. These results, like ours, rely on the behavior of the noisy prediction error $\epsilon_1(\delta)$ for small $\delta$.



C. Examples 

We now present numerical results for Jakes' model and the Gauss-Markov model, which are two widely used Doppler and regular processes, respectively. The classical Jakes' correlation model has the following spectrum [62], [54]

$$S_h(\xi) = \frac{1}{\pi \sqrt{F^2 - \xi^2}}, \quad -F < \xi < F, \tag{69}$$

and autocorrelation function $J_0(2\pi F \tau)$. No closed-form solution is known for the prediction or filtering error. Under the Gauss-Markov model (i.e., autoregressive of order 1) the channel evolves in time as

$$h(\tau) = r\, h(\tau - 1) + \sqrt{1 - r^2}\, \Delta(\tau) \tag{70}$$

where $r$ is the correlation coefficient ($0 < r < 1$) and the innovation process $\Delta(\tau)$ is unit-variance complex Gaussian, i.i.d. in time. The prediction error for such a model can be written in closed form and is given by (see for example [32])

$$\epsilon_1(\delta) = \frac{1}{2} \left[ (1 - r^2)(1 - \delta) + \sqrt{(1 - r^2)^2 (1 - \delta)^2 + 4 (1 - r^2)\, \delta} \right] \tag{71}$$

‡ In order to have a non-interference-limited system we can always use TDMA and serve one user at a time. However, in this case the sum rate would grow like $\log(P/N_0)$ instead of $M \log(P/N_0)$, as promised by the MIMO downlink with perfect CSIT.

Fig. 7. Achievable rate lower bounds with optimal filtering for the Jakes' and Gauss-Markov models for M = 4 and v = 10 km/hr (F = 0.0185 and r = 0.9966). Also shown are the rates with perfect CSI and with block-by-block estimation.
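The steady-state prediction error of the Gauss-Markov model can be cross-checked with a short sketch. It assumes the normalized observation model (unit-variance channel observed in white noise of variance $\delta$) and compares two hypothetical helpers: a closed form obtained as the positive root of $e^2 - (1 - r^2)(1 - \delta)\,e - (1 - r^2)\,\delta = 0$, and a direct iteration of the scalar Kalman (Riccati) recursion. Both function names are illustrative and not taken from the paper.

```python
import math

def gauss_markov_prediction_mmse(r, delta):
    """Closed-form steady-state one-step prediction MMSE for an AR(1) channel
    with correlation r observed in noise of variance delta."""
    q = 1.0 - r * r                      # innovation variance
    b = q * (1.0 - delta)
    return 0.5 * (b + math.sqrt(b * b + 4.0 * q * delta))

def riccati_prediction_mmse(r, delta, iters=5000):
    """Same quantity obtained by iterating the scalar Kalman recursion."""
    e = 1.0                              # prior variance of the channel
    for _ in range(iters):
        filt = e * delta / (e + delta)   # error after using the new observation
        e = r * r * filt + (1.0 - r * r) # error after predicting one step ahead
    return e

r = 0.9966
for snr_db in (0, 10, 20, 30):
    delta = 10.0 ** (-snr_db / 10.0)     # delta = N0 / (beta1 * P) with beta1 = 1
    print(snr_db, gauss_markov_prediction_mmse(r, delta), riccati_prediction_mmse(r, delta))
```

As $\delta \to 0$ the error tends to $1 - r^2 > 0$ rather than to zero, which is the regular-process behavior responsible for rate saturation under delayed feedback.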



For Jakes' model we have $F = \frac{v f_c}{c} T_f$. In all results we consider $f_c = 2$ GHz and $T_f = 1$ msec. Motivated by the maximum-entropy principle [63], several works in wireless communication have modeled channel fading as a Gauss-Markov process with one-step correlation coefficient $r = J_0(2\pi F)$, given by Jakes' model. Comparing the performance of the true Jakes' model with its Gauss-Markov maximum-entropy approximation, we will point out that the latter may be overly pessimistic for high-speed mobile terminals.
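The parameter values quoted in the numerical results can be reproduced with a few lines. This sketch assumes a speed of light of $3 \times 10^8$ m/s and evaluates $J_0$ through its power series, which is adequate for the small arguments involved; the function names are hypothetical.

```python
import math

def normalized_doppler(speed_kmh, carrier_hz=2e9, frame_s=1e-3, c=3e8):
    """Normalized maximum Doppler shift F = (v * fc / c) * Tf."""
    speed_ms = speed_kmh / 3.6
    return speed_ms * carrier_hz / c * frame_s

def bessel_j0(x, terms=30):
    """Bessel function J0 via its power series sum_k (-1)^k (x/2)^(2k) / (k!)^2."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= -((x / 2.0) ** 2) / ((k + 1) ** 2)
    return total

for v in (3.0, 10.0):
    F = normalized_doppler(v)
    r = bessel_j0(2.0 * math.pi * F)     # Gauss-Markov one-step correlation
    print(f"v = {v} km/hr: F = {F:.4f}, r = {r:.4f}")
```

For $v = 10$ km/hr this gives $F \approx 0.0185$ and $r \approx 0.9966$, and for $v = 3$ km/hr it gives $F \approx 0.0056$, in agreement with the values used in the figures.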

In Fig. 7 the achievable rate lower bound with delay-free feedback ($d = 0$) and optimal filtering is plotted versus SNR for the Jakes' and Gauss-Markov models, for $M = 4$, $v = 10$ km/hr ($F = 0.0185$ and $r = 0.9966$), and $\beta_1 = 1$. Filtering is seen to provide an advantage with respect to block-by-block estimation for a wide range of SNRs. For the Gauss-Markov model this advantage vanishes around 30 dB, whereas for Jakes' model this advantage persists far beyond the range of this plot.

Using the same parameters, in Fig. 8 we plot the lower bound for one-step prediction ($d = 1$) versus SNR for $v = 3$ and 10 km/hr ($F = 0.0056$ and $F = 0.0185$). This figure illustrates the contrast between Doppler and regular processes: for Jakes' model the achieved rate is quite close to the perfect channel state information rate (although a slight loss in multiplexing gain is evident), whereas the rate for the Gauss-Markov model saturates at sufficiently high SNR due to the unpredictability inherent to the model. To further emphasize the difference in behavior, in Fig. 9 we plot the lower bound for one-step prediction ($d = 1$) versus the number of common training symbols per block, for $P/N_0 = 10$ and 15 dB and $v = 10$ km/hr. As $\beta_1$ increases (and thus the observation noise decreases) the rate for Jakes' model converges to the ideal case. On the other hand, the rate for the Gauss-Markov model saturates at a rate strictly smaller than the ideal channel state information rate, because there is a strictly positive prediction error even if noiseless past observations (i.e., $\beta_1 \to \infty$) are provided.

In conclusion, the most noteworthy result of this analysis is that under common fading models (Doppler processes), both analog and digital feedback schemes achieve a potentially high multiplexing gain even with realistic, noisy and delayed feedback.

Fig. 9. Achievable rate lower bounds with optimal one-step prediction versus $\beta_1$ for M = 4 and v = 10 km/hr.



VII. Conclusions 

This paper presents a comprehensive and rigorous analysis of the achievable performance of ZF beamforming 
under pilot-based channel estimation and explicit channel state feedback. We considered what we believe are the 
most relevant system aspects. In particular, the often neglected effect of explicit channel estimation at the UTs 
is taken into account, including both common training and dedicated training phases. As for the feedback, our 
closed-form bounds allow for a detailed comparison of analog and digital feedback schemes, including the effects 
of the MIMO-MAC fading channel, of digital feedback decoding errors, and of feedback delay. 

Our results build on prior work, but generalize many results and models. We have focused on the case of FDD, 
but our results easily extend to TDD systems with channel reciprocity. It is perhaps important to point out here that 
our results show that, even in the case of FDD, a system with explicit CSIT feedback can be implemented, where 
the number of training and feedback channel uses scales linearly with the number of BS antennas, and eventually 
with the downlink throughput. 

The throughput of the system analyzed here can be improved via the use of combined beamforming and user selection/scheduling. Simulation results show that a system with K = 10 and M = 4, with greedy scheduling as proposed in [15], [32], achieves a very small gap with respect to the optimal dirty-paper coding and perfect CSIT case with the same parameters. Although a clean closed-form analytical characterization of a system with beamforming and user selection based on imperfect channel state information appears to be difficult, recent results [33], [45] indicate that the dependence on CSIT quality when user selection is performed is roughly the same as in the equal-power/no-selection scenario analyzed here.

We would like to conclude by noticing that some practically relevant extensions of the present work have been 
presented (by the same authors and by others) since the submission of this paper. In particular, the rate gap analysis 
was extended to the very relevant case of MIMO OFDM with frequency-correlated fading in [52], the optimal 
allocation of training and feedback resources is considered in [37], [38], explicit coding schemes for the CSIT 
digital feedback MIMO-MAC channel are presented in [53], and comparisons between single-user and multi-user 
MIMO (based on the bounds developed here and related approximations) are performed in [64]. 



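Appendix I
Proof of Theorem 1

The proof is closely inspired by that of Lemma B.0.1 of [36]. First, notice that since $\hat{a}_{k,k}$ is a function of the dedicated training observation, by the data-processing inequality the mutual information between $u_k$ and the UT observations is lower bounded by $I(u_k; y_k, \hat{a}_{k,k})$.

Then, because $I(u_k; y_k, \hat{a}_{k,k}) = h(u_k) - h(u_k | y_k, \hat{a}_{k,k})$ and $h(u_k) = \log\left(\pi e \frac{P}{M}\right)$, a lower bound on mutual information is derived by upper bounding $h(u_k | y_k, \hat{a}_{k,k})$ as follows:

$$\begin{aligned}
h(u_k | y_k, \hat{a}_{k,k}) &\overset{(a)}{=} h(u_k - \alpha y_k | y_k, \hat{a}_{k,k}) \\
&\overset{(b)}{\le} h(u_k - \alpha y_k | \hat{a}_{k,k}) \\
&\overset{(c)}{\le} \mathbb{E}\left[ \log\left( \pi e\, \mathbb{E}\left[ |u_k - \alpha y_k|^2 \,\middle|\, \hat{a}_{k,k} \right] \right) \right]
\end{aligned} \tag{72}$$

where (a) holds for any deterministic function $\alpha$ of $y_k$ and $\hat{a}_{k,k}$, (b) follows from the fact that conditioning reduces entropy, and (c) follows from the fact that differential entropy is maximized by a Gaussian RV with the same second moment. Substituting (12) into the received signal model, we have

$$y_k = (\hat{a}_{k,k} + f_k) u_k + I_k + z_k \tag{73}$$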



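where $\hat{a}_{k,k} u_k$ and $f_k u_k + I_k + z_k$ are uncorrelated and zero-mean, even if we condition on $\hat{a}_{k,k}$, because $\hat{a}_{k,k}, f_k, u_1, \ldots$ are independent, zero-mean Gaussians. Thus, we have

$$\mathbb{E}\left[ |y_k|^2 \,\middle|\, \hat{a}_{k,k} \right] = |\hat{a}_{k,k}|^2\, \mathbb{E}[|u_k|^2] + \sigma_2^2\, \mathbb{E}[|u_k|^2] + \mathbb{E}\left[ |I_k|^2 \,\middle|\, \hat{a}_{k,k} \right] + N_0 \tag{74}$$

Choosing $\alpha$ that minimizes $\mathbb{E}[|u_k - \alpha y_k|^2 | \hat{a}_{k,k}]$ tightens the bound. This corresponds to setting $\alpha y_k$ equal to the linear MMSE estimate of $u_k$ given $y_k$ and $\hat{a}_{k,k}$:

$$\alpha y_k = \frac{\mathbb{E}[u_k y_k^* | \hat{a}_{k,k}]}{\mathbb{E}[|y_k|^2 | \hat{a}_{k,k}]}\, y_k = \frac{\hat{a}_{k,k}^*\, \mathbb{E}[|u_k|^2]}{\mathbb{E}[|y_k|^2 | \hat{a}_{k,k}]}\, y_k \tag{75}$$

Using (74), the corresponding MMSE is given by

$$\begin{aligned}
\mathbb{E}\left[ |u_k - \alpha y_k|^2 \,\middle|\, \hat{a}_{k,k} \right] &= \mathbb{E}[|u_k|^2] - \frac{|\hat{a}_{k,k}|^2 \left( \mathbb{E}[|u_k|^2] \right)^2}{\mathbb{E}[|y_k|^2 | \hat{a}_{k,k}]} && (76) \\
&= \frac{P}{M} \cdot \frac{\sigma_2^2 \frac{P}{M} + \mathbb{E}[|I_k|^2 | \hat{a}_{k,k}] + N_0}{\left( |\hat{a}_{k,k}|^2 + \sigma_2^2 \right) \frac{P}{M} + \mathbb{E}[|I_k|^2 | \hat{a}_{k,k}] + N_0} && (77)
\end{aligned}$$

Replacing (77) into (72) and using $h(u_k) = \log\left(\pi e \frac{P}{M}\right)$, we obtain (14).

Appendix II
Proof of Theorem 2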

Using the lower bound on $R_k$ from Theorem 1 we have:

$$\Delta R \le \mathbb{E}\left[\, \ldots \,\right] \qquad (78), (79)$$

where (a) follows by dropping the non-negative term $\mathbb{E}[|I_k|^2 | \hat{a}_{k,k}]/N_0$. Using the fact that $\mathbf{h}_k$ is spatially white and $\mathbf{v}_k$ is selected independently of $\mathbf{h}_k$ (by the ZF procedure), it follows that $\mathbf{h}_k^H \mathbf{v}_k \sim \mathcal{CN}(0, 1)$ and $\hat{a}_{k,k} \sim \mathcal{CN}(0, 1 - \sigma_2^2)$. Direct application of Lemma 2, which is provided below, with $A = P/(N_0 M)$, $\lambda = \sigma_2^2$ and $X = |\mathbf{h}_k^H \mathbf{v}_k|^2$, thus proves (b). Finally, (c) follows from the concavity of $\log(\cdot)$ and Jensen's inequality.

Lemma 2: If $X$ is a non-negative random variable with $\mathbb{E}[X] = 1$, then for any $A > 0$ and any $0 \le \lambda \le 1$:

$$\mathbb{E}\left[ \log(1 + X A) \right] \le \mathbb{E}\left[ \log\left( 1 + (\lambda + (1 - \lambda) X) A \right) \right] \tag{80}$$

Proof: For all $0 \le z \le 1$, define the function

$$\psi(z) = \mathbb{E}\left[ \log\left( 1 + z A + (1 - z) X A \right) \right] \tag{81}$$

Then (80) is equivalent to the inequality $\psi(0) \le \psi(\lambda)$. By the concavity of $\log(\cdot)$ and Jensen's inequality we have

$$\psi(z) \le \log\left( 1 + z A + (1 - z)\, \mathbb{E}[X]\, A \right) = \log(1 + A) = \psi(1) \tag{82}$$

In particular, $\psi(0) \le \psi(1)$. Moreover, $\psi(z)$ is an expectation of the composition of a concave function and a linear function of $z$, and is hence concave [65]. Thus, the concave function $\psi(z)$ for $z \in [0, 1]$ lies above the line joining the points $(0, \psi(0))$ and $(1, \psi(1))$. Hence, we have $\psi(0) \le \psi(\lambda)$ for $\lambda \in [0, 1]$, which proves (80). □
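The inequality $\psi(0) \le \psi(\lambda) \le \psi(1)$ can be sanity-checked numerically. The sketch below takes $X$ to be exponential with unit mean, which is the distribution of the squared magnitude of a $\mathcal{CN}(0,1)$ variable, and computes the expectations by deterministic midpoint integration rather than simulation. The helper names and the choice $A = 10$ are illustrative assumptions.

```python
import math

def expect_exp1(g, upper=40.0, n=20000):
    """E[g(X)] for X ~ Exp(1), via the midpoint rule on [0, upper].

    The neglected tail beyond `upper` has probability exp(-upper).
    """
    step = upper / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * step
        total += g(x) * math.exp(-x) * step
    return total

def psi(z, A):
    """psi(z) = E[log(1 + z*A + (1 - z)*X*A)] for X ~ Exp(1)."""
    return expect_exp1(lambda x: math.log(1.0 + z * A + (1.0 - z) * x * A))

A = 10.0
for lam in (0.0, 0.1, 0.5, 0.9, 1.0):
    print(lam, psi(lam, A))
```

In this example the printed values increase with $\lambda$ from $\psi(0)$ up to $\psi(1) = \log(1 + A)$, as the concavity argument predicts.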



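Appendix III
Proof of Theorem 4

Using (18), to compute the rate gap we only need to find $\mathbb{E}[|\mathbf{h}_k^H \mathbf{v}_j|^2]$, which is evaluated through the chain of equalities (83). There, (a) follows from the decomposition of $\mathbf{h}_k$ into its estimate and the estimation error $\mathbf{e}_k$, (b) follows from the fact that $\hat{\mathbf{h}}_k^H \mathbf{v}_j = 0$ for all $j \ne k$ by naive ZF, (c) is obtained from the independence of $\mathbf{e}_k$ and $\mathbf{v}_j$ ($\mathbf{v}_j$ is a deterministic function of $\{\hat{\mathbf{h}}_i\}_{i \ne j}$), and (d) follows from $\mathbb{E}[\mathbf{e}_k \mathbf{e}_k^H] = \sigma_e^2 \mathbf{I}$.

Appendix IV
Proof of Theorem 5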

To compute the rate gap upper bound, we determine $\mathbb{E}[|\mathbf{h}_k^H \mathbf{v}_j|^2]$ by writing the channel in terms of the UT channel estimate (which is quantized) and the UT estimation error, $\mathbf{h}_k = \tilde{\mathbf{h}}_k + \mathbf{n}_k$. This yields the chain of relations (84), in which (a) holds because $\mathbf{n}_k$ is zero-mean Gaussian and is independent of $\tilde{\mathbf{h}}_k$ and $\mathbf{v}_j$, (b) follows from the independence of the channel norm and direction of $\tilde{\mathbf{h}}_k$, (c) follows from (35) and from the property [26, Lemma 2] on the expected value of $\sin^2$ of the quantization angle, and finally (d) follows by computing the expected norm of the MMSE estimate $\tilde{\mathbf{h}}_k$ using $\mathbf{s}_k = \sqrt{\beta_1 P}\, \mathbf{h}_k + \mathbf{z}_k$. The final result follows by using the above result in the expression (16) for the rate gap.

Appendix V
Proof of Theorem 6

We first decompose the interference variance term as

$$(1 - P_{e,{\rm fb}})\, \mathbb{E}\left[ |\mathbf{h}_k^H \mathbf{v}_j|^2 \,\middle|\, \text{no fb. errors} \right] + P_{e,{\rm fb}}\, \mathbb{E}\left[ |\mathbf{h}_k^H \mathbf{v}_j|^2 \,\middle|\, \text{fb. errors} \right]$$

where $\mathbb{E}[|\mathbf{h}_k^H \mathbf{v}_j|^2 \,|\, \text{no fb. errors}]$ is the same as in the error-free case and is thus given in (84), while for the case of feedback errors we trivially have $\mathbb{E}[|\mathbf{h}_k^H \mathbf{v}_j|^2 \,|\, \text{fb. errors}] \le 1$. The final result is reached by simply substituting $B = \alpha(M - 1) \log_2(\ldots)$ and using the bound on the beta function (35).

Appendix VI
Proof of Theorem 7

Using the argument from the proof of Theorem 4 (analog FB over AWGN channel), the expected interference coefficient $\mathbb{E}[|\mathbf{h}_k^H \mathbf{v}_j|^2]$ is equal to the variance of the channel estimation error. This quantity must be averaged over the uplink channel matrix $\mathbf{A}$, and thus, using symmetry and (49), it is given by (87), where $\mathrm{mmse}(\rho)$ is defined in (51).

In order to obtain the high-SNR result, we first state a closed-form expression for $\mathrm{mmse}(\rho)$ using well-known results from multivariate statistics (see for example [66]):

$$\mathrm{mmse}(\rho) = \frac{e^{1/\rho}}{\rho} \sum_{k=0}^{L-1} \sum_{l=0}^{k} \sum_{m=0}^{2l} X_{k,l,m}\, E_i(M - L + m + 1, 1/\rho) \tag{88}$$

where the coefficients $X_{k,l,m}$ are given by

$$X_{k,l,m} = \frac{(-1)^m (2l)!\, (M - L + m)!}{L\, 2^{2k - m}\, l!\, m!\, (M - L + l)!} \binom{2(k - l)}{k - l} \binom{2(M - L + l)}{2l - m}$$

Based upon this we can characterize the asymptotic behavior of the product $\rho\, \mathrm{mmse}(\rho)$ for $\rho \to \infty$, using the asymptotic expansion of $e^{1/\rho} E_i(n, 1/\rho)$.

Appendix VII
Proof of Theorem 8

We can lower bound the genie-aided rate of Theorem 3 through a chain of inequalities in which (a) follows by dropping the non-negative terms, (b) follows by conditioning with respect to the uplink channel matrix $\mathbf{A}$ and then applying Jensen's inequality in the inner conditional expectation, and (c) follows by noticing that $\mathbb{E}[|I_k|^2 | \mathbf{A}] = (M - 1) P \sigma^2(\mathbf{A})$, where $\sigma^2(\mathbf{A})$ is defined in (49). Then, we obtain the upper bound (94) for the gap between the ideal ZF rate and the genie-aided rate, where (a) concerns the term $\mathbb{E}[\,\ldots\,]$, (b) follows by using the same derivation that leads to (87) and (51), and the last line follows by monotonicity of the log, where $\lambda_{\min}$ denotes the minimum eigenvalue of $\mathbf{A}^H \mathbf{A}$.

Our goal is to show that the term in the last line of (94) is bounded. To this purpose, we write the last line of (94) as the sum of three terms, denoted by (95). For $\mathbf{A}$ of dimension $M \times M$, complex Gaussian with i.i.d. zero-mean components, it is well known that $\lambda_{\min}$ is chi-squared with 2 degrees of freedom and mean 1 [67], which allows the third term in (95) to be evaluated. The second term in (95) is bounded by a constant, independent of $P/N_0$, and finally the first term in (95), for high SNR, can be written as $\log(P/N_0) + O(1)$. It follows that the $\log(P/N_0)$ terms in the first and the third terms of the upper bound cancel, so that (95) is bounded. This establishes the result.



Appendix VIII
Genie-Aided Upper Bound for Regular Processes with Delayed Feedback

We show that the genie-aided upper bound of Theorem 3 is uniformly bounded for any SNR when the noiseless prediction error is positive. For analytical simplicity, we assume perfect common training and perfect (delayed) feedback. Hence, the only source of "noise" in the CSIT is the prediction error. We can write $\mathbf{h}_k(t) = \hat{\mathbf{h}}_k(t) + \mathbf{n}_k(t)$, where $\hat{\mathbf{h}}_k(t)$ is the one-step prediction of $\mathbf{h}_k(t)$ from its (noiseless) past, and $\mathbf{n}_k(t)$ is the prediction error. From what was stated earlier, we have that $\mathbf{h}_k(t)$, $\hat{\mathbf{h}}_k(t)$ and $\mathbf{n}_k(t)$ are jointly complex Gaussian, i.i.d. in the spatial domain, with mean zero and variance per component equal to $1$, $1 - \epsilon_1(0)$ and $\epsilon_1(0)$, respectively. It is useful to write the error as $\mathbf{n}_k(t) = \sqrt{\epsilon_1(0)}\, \boldsymbol{\Delta}_k(t)$, where $\boldsymbol{\Delta}_k(t) \sim \mathcal{CN}(\mathbf{0}, \mathbf{I})$. From (20), the genie-aided upper bound is given by (96), where $\mathbf{v}_j(t)$ is orthogonal to $\hat{\mathbf{h}}_k(t)$. Using the fact that the upper bound is non-decreasing in $P/N_0$, we let $P/N_0 \to \infty$ in (96) and obtain a chain of inequalities in which (a) follows by applying Jensen's inequality to the first term and noticing that both $\mathbf{h}_k^H(t) \mathbf{v}_k(t)$ and $\boldsymbol{\Delta}_k^H(t) \mathbf{v}_j(t)$ are $\sim \mathcal{CN}(0, 1)$, (b) follows by expressing $|\boldsymbol{\Delta}_k^H(t) \mathbf{v}_j(t)|^2 = |\boldsymbol{\Delta}_k(t)|^2\, \frac{|\boldsymbol{\Delta}_k^H(t) \mathbf{v}_j(t)|^2}{|\boldsymbol{\Delta}_k(t)|^2}$, (c) is obtained by noticing that $|\boldsymbol{\Delta}_k(t)|^2$ is chi-square distributed with $2M$ degrees of freedom and that $\sum_{j \ne k} \frac{|\boldsymbol{\Delta}_k^H(t) \mathbf{v}_j(t)|^2}{|\boldsymbol{\Delta}_k(t)|^2}$ is beta distributed with parameters $(M - 1, 1)$, and finally $\psi(M)$ is the Euler digamma function.